This document describes the analysis of TCR repertoires for the manuscript “Unique roles of coreceptor-bound LCK in helper and cytotoxic T cells” by Horkova et al. The analysis workflow included:
Raw data are deposited in the Sequence Read Archive (PRJNA872031). Processed files can be downloaded download from Zenodo. The Zenodo archive contains the following files, which are needed to run this script:
merged_TCR_repertoires.csvmetadata_Lck.csvcount_table_tra.csv, "count_table_trb.csv"immdata_tra.rds, immdata_trb.rdsLet’s read the required files:
#
merged_repertoires <- read.csv("merged_TCR_repertoires.csv")
md <- read.csv("metadata_Lck.csv")
counts_tra <- read_csv("count_table_tra.csv")
counts_tra$...1 <- NULL
counts_trb <- read_csv("count_table_trb.csv")
counts_trb$...1 <- NULLWe will start with processing of the file merged_TCR_repertoires.csv. This file contains outputs of MiXCR software, i.e., all the assembled CDR3 clonotypes, their counts in each sample, nucleotide and amino-acid sequences, all V, D and J alignments with scores, together with some meta-data. the column num_id contains info about the sample number, which we will frequently use to attach meta-data to sample numbers.
Here, we summarize the variables in the file merged_TCR_repertoires.csv:
ExpData(data=merged_repertoires, type=2)## Index Variable_Name Variable_Type Sample_n Missing_Count Per_of_Missing
## 1 1 cloneId integer 1001983 0 0.000
## 2 2 cloneCount integer 1001983 0 0.000
## 3 3 cloneFraction numeric 1001983 0 0.000
## 4 4 targetSequences character 1001983 0 0.000
## 5 5 targetQualities character 1001983 0 0.000
## 6 6 allVHitsWithScore character 1001983 0 0.000
## 7 7 allDHitsWithScore character 566670 435313 0.434
## 8 8 allJHitsWithScore character 1001983 0 0.000
## 9 9 allCHitsWithScore character 955 1001028 0.999
## 10 10 allVAlignments character 1001983 0 0.000
## 11 11 allDAlignments character 566670 435313 0.434
## 12 12 allJAlignments character 1001983 0 0.000
## 13 13 allCAlignments character 144 1001839 1.000
## 14 14 nSeqCDR3 character 1001983 0 0.000
## 15 15 minQualCDR3 integer 1001983 0 0.000
## 16 16 nSeqFR4 logical 0 1001983 1.000
## 17 17 aaSeqCDR3 character 1001983 0 0.000
## 18 18 num_id integer 1001983 0 0.000
## 19 19 chain character 1001983 0 0.000
## 20 20 Mouse_strain character 1001983 0 0.000
## 21 21 Cell_type character 1001983 0 0.000
## 22 22 Organ character 1001983 0 0.000
## 23 23 Cell_count numeric 1001983 0 0.000
## 24 24 Exp integer 1001983 0 0.000
## 25 25 INDEX_i7 integer 1001983 0 0.000
## 26 26 INDEX_i5 integer 1001983 0 0.000
## No_of_distinct_values
## 1 68138
## 2 120
## 3 994
## 4 823904
## 5 746125
## 6 93452
## 7 327
## 8 6929
## 9 122
## 10 14932
## 11 13102
## 12 17988
## 13 1
## 14 823904
## 15 70
## 16 0
## 17 541554
## 18 28
## 19 2
## 20 3
## 21 2
## 22 2
## 23 28
## 24 3
## 25 12
## 26 3
First, we will generate summary count tables for TRA and TRB CDR3 sequences (these tables are provided as Supplementary table S1 in the manuscript). We will use pivot_wider function and we will change the num_id of the samples to meaningful names.
Here, we show the resulting table for TRA:
count_table_tra <- merged_repertoires %>%
dplyr::filter(chain == "TRA") %>%
dplyr::select(aaSeqCDR3, num_id, cloneCount)
count_table_tra2 <- count_table_tra %>%
pivot_wider(names_from = num_id, values_from = cloneCount, values_fill = 0, values_fn = sum)
md2_tra <- merged_repertoires %>% filter(chain == "TRA") %>% group_by(aaSeqCDR3) %>% slice_head(n = 1)
excel_count_table_tra <- counts_tra %>% left_join(md2_tra %>% select(allVHitsWithScore, allDHitsWithScore, allJHitsWithScore))
md_to_join <- md %>% mutate(new_name = paste(Cell_type, Organ, Mouse_strain, paste0("Exp0",Exp))) %>% select(num_id, new_name) %>% arrange(num_id)
order <- match(as.numeric(colnames(counts_tra)[2:29]), md_to_join$num_id)
colnames(excel_count_table_tra)[2:29] <- pull(md_to_join, new_name)[order]
excel_count_table_tra2 <- excel_count_table_tra %>% dplyr::select(1, 31, 32, 30, 22, 12, 25, 11, 6, 26, 13, 18, 4, 28, 17, 14, 8, 23, 9, 7, 16, 24, 2, 21, 5, 29, 10, 20, 19, 3, 15, 27)
excel_count_table_tra2 %>% head## # A tibble: 6 × 32
## aaSeqCDR3 allDH…¹ allJH…² allVH…³ CD4 L…⁴ CD4 L…⁵ CD4 L…⁶ CD4 L…⁷ CD4 L…⁸
## <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 CAASASSGSWQLIF <NA> TRAJ22… TRAV14… 38 24 73 30 43
## 2 CAASNMGYKLTF <NA> TRAJ9*… TRAV7-… 16 11 21 15 14
## 3 CAVSASSGSWQLIF <NA> TRAJ22… TRAV3D… 10 5 6 5 9
## 4 CAASDNYAQGLTF <NA> TRAJ26… TRAV14… 4 0 6 8 3
## 5 CALSDRYNQGKLIF <NA> TRAJ23… TRAV12… 1 1 8 0 1
## 6 CAASDDTNAYKVIF <NA> TRAJ30… TRAV14… 0 0 1 4 2
## # … with 23 more variables: `CD4 Lymph nodes WT Exp02` <dbl>,
## # `CD4 Lymph nodes WT Exp03` <dbl>, `CD4 Thymus CA Exp01` <dbl>,
## # `CD4 Thymus CA Exp03` <dbl>, `CD4 Thymus CAKR Exp01` <dbl>,
## # `CD4 Thymus CAKR Exp02` <dbl>, `CD4 Thymus CAKR Exp03` <dbl>,
## # `CD4 Thymus WT Exp02` <dbl>, `CD4 Thymus WT Exp03` <dbl>,
## # `CD8 Lymph nodes CA Exp02` <dbl>, `CD8 Lymph nodes CA Exp03` <dbl>,
## # `CD8 Lymph nodes CAKR Exp02` <dbl>, `CD8 Lymph nodes CAKR Exp03` <dbl>, …
ExpData(data=excel_count_table_tra2, type=2)## Index Variable_Name Variable_Type Sample_n Missing_Count
## 1 1 aaSeqCDR3 character 114760 0
## 2 2 allDHitsWithScore character 4769 109991
## 3 3 allJHitsWithScore character 114760 0
## 4 4 allVHitsWithScore character 114760 0
## 5 5 CD4 Lymph nodes CA Exp01 numeric 114760 0
## 6 6 CD4 Lymph nodes CA Exp02 numeric 114760 0
## 7 7 CD4 Lymph nodes CA Exp03 numeric 114760 0
## 8 8 CD4 Lymph nodes CAKR Exp02 numeric 114760 0
## 9 9 CD4 Lymph nodes CAKR Exp03 numeric 114760 0
## 10 10 CD4 Lymph nodes WT Exp02 numeric 114760 0
## 11 11 CD4 Lymph nodes WT Exp03 numeric 114760 0
## 12 12 CD4 Thymus CA Exp01 numeric 114760 0
## 13 13 CD4 Thymus CA Exp03 numeric 114760 0
## 14 14 CD4 Thymus CAKR Exp01 numeric 114760 0
## 15 15 CD4 Thymus CAKR Exp02 numeric 114760 0
## 16 16 CD4 Thymus CAKR Exp03 numeric 114760 0
## 17 17 CD4 Thymus WT Exp02 numeric 114760 0
## 18 18 CD4 Thymus WT Exp03 numeric 114760 0
## 19 19 CD8 Lymph nodes CA Exp02 numeric 114760 0
## 20 20 CD8 Lymph nodes CA Exp03 numeric 114760 0
## 21 21 CD8 Lymph nodes CAKR Exp02 numeric 114760 0
## 22 22 CD8 Lymph nodes CAKR Exp03 numeric 114760 0
## 23 23 CD8 Lymph nodes WT Exp01 numeric 114760 0
## 24 24 CD8 Lymph nodes WT Exp02 numeric 114760 0
## 25 25 CD8 Lymph nodes WT Exp03 numeric 114760 0
## 26 26 CD8 Thymus CA Exp01 numeric 114760 0
## 27 27 CD8 Thymus CA Exp03 numeric 114760 0
## 28 28 CD8 Thymus CAKR Exp02 numeric 114760 0
## 29 29 CD8 Thymus CAKR Exp03 numeric 114760 0
## 30 30 CD8 Thymus WT Exp01 numeric 114760 0
## 31 31 CD8 Thymus WT Exp02 numeric 114760 0
## 32 32 CD8 Thymus WT Exp03 numeric 114760 0
## Per_of_Missing No_of_distinct_values
## 1 0.000 114760
## 2 0.958 110
## 3 0.000 2980
## 4 0.000 31200
## 5 0.000 28
## 6 0.000 22
## 7 0.000 37
## 8 0.000 26
## 9 0.000 35
## 10 0.000 29
## 11 0.000 35
## 12 0.000 15
## 13 0.000 32
## 14 0.000 14
## 15 0.000 14
## 16 0.000 21
## 17 0.000 31
## 18 0.000 27
## 19 0.000 21
## 20 0.000 23
## 21 0.000 17
## 22 0.000 25
## 23 0.000 26
## 24 0.000 31
## 25 0.000 30
## 26 0.000 9
## 27 0.000 23
## 28 0.000 10
## 29 0.000 11
## 30 0.000 22
## 31 0.000 9
## 32 0.000 30
Here, we show the resulting table for TRB:
count_table_trb <- merged_repertoires %>%
dplyr::filter(chain == "TRB") %>%
dplyr::select(aaSeqCDR3, num_id, cloneCount)
count_table_trb2 <- count_table_trb %>%
pivot_wider(names_from = num_id, values_from = cloneCount, values_fill = 0, values_fn = sum)
md2_trb <- merged_repertoires %>%
filter(chain == "TRB") %>%
group_by(aaSeqCDR3) %>%
slice_head(n = 1)
excel_count_table_trb <- counts_trb %>%
left_join(md2_trb %>%
select(allVHitsWithScore, allDHitsWithScore, allJHitsWithScore))
md_to_join <- md %>%
mutate(new_name = paste(Cell_type, Organ, Mouse_strain, paste0("Exp0",Exp))) %>%
select(num_id, new_name) %>%
arrange(num_id)
order <- match(as.numeric(colnames(counts_trb)[2:29]), md_to_join$num_id)
colnames(excel_count_table_trb)[2:29] <- pull(md_to_join, new_name)[order]
excel_count_table_trb2 <- excel_count_table_trb %>%
dplyr::select(1, 30:32, 22, 12, 25, 11, 6, 26, 13, 18, 4, 28, 17, 14, 8, 23, 9, 7, 16, 24, 2, 21, 5, 29, 10, 20, 19, 3, 15, 27)
excel_count_table_trb2 %>% head## # A tibble: 6 × 32
## aaSeqCDR3 allVH…¹ allDH…² allJH…³ CD4 L…⁴ CD4 L…⁵ CD4 L…⁶ CD4 L…⁷ CD4 L…⁸
## <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 CASSDSAETLYF TRBV13… <NA> TRBJ2-… 7 1 6 1 12
## 2 CASSDAEQFF TRBV13… <NA> TRBJ2-… 0 0 5 0 1
## 3 CASSDAGYEQYF TRBV13… <NA> TRBJ2-… 0 1 2 0 3
## 4 CASSDWGGYAEQFF TRBV13… TRBD2*… TRBJ2-… 3 1 1 4 0
## 5 CASSDSGGQDTQYF TRBV13… TRBD2*… TRBJ2-… 1 0 2 1 0
## 6 CASRDRNTEVFF TRBV13… TRBD1*… TRBJ1-… 6 1 3 3 0
## # … with 23 more variables: `CD4 Lymph nodes WT Exp02` <dbl>,
## # `CD4 Lymph nodes WT Exp03` <dbl>, `CD4 Thymus CA Exp01` <dbl>,
## # `CD4 Thymus CA Exp03` <dbl>, `CD4 Thymus CAKR Exp01` <dbl>,
## # `CD4 Thymus CAKR Exp02` <dbl>, `CD4 Thymus CAKR Exp03` <dbl>,
## # `CD4 Thymus WT Exp02` <dbl>, `CD4 Thymus WT Exp03` <dbl>,
## # `CD8 Lymph nodes CA Exp02` <dbl>, `CD8 Lymph nodes CA Exp03` <dbl>,
## # `CD8 Lymph nodes CAKR Exp02` <dbl>, `CD8 Lymph nodes CAKR Exp03` <dbl>, …
ExpData(data=excel_count_table_trb2, type=2)## Index Variable_Name Variable_Type Sample_n Missing_Count
## 1 1 aaSeqCDR3 character 426800 0
## 2 2 allVHitsWithScore character 426800 0
## 3 3 allDHitsWithScore character 331672 95128
## 4 4 allJHitsWithScore character 426800 0
## 5 5 CD4 Lymph nodes CA Exp01 numeric 426800 0
## 6 6 CD4 Lymph nodes CA Exp02 numeric 426800 0
## 7 7 CD4 Lymph nodes CA Exp03 numeric 426800 0
## 8 8 CD4 Lymph nodes CAKR Exp02 numeric 426800 0
## 9 9 CD4 Lymph nodes CAKR Exp03 numeric 426800 0
## 10 10 CD4 Lymph nodes WT Exp02 numeric 426800 0
## 11 11 CD4 Lymph nodes WT Exp03 numeric 426800 0
## 12 12 CD4 Thymus CA Exp01 numeric 426800 0
## 13 13 CD4 Thymus CA Exp03 numeric 426800 0
## 14 14 CD4 Thymus CAKR Exp01 numeric 426800 0
## 15 15 CD4 Thymus CAKR Exp02 numeric 426800 0
## 16 16 CD4 Thymus CAKR Exp03 numeric 426800 0
## 17 17 CD4 Thymus WT Exp02 numeric 426800 0
## 18 18 CD4 Thymus WT Exp03 numeric 426800 0
## 19 19 CD8 Lymph nodes CA Exp02 numeric 426800 0
## 20 20 CD8 Lymph nodes CA Exp03 numeric 426800 0
## 21 21 CD8 Lymph nodes CAKR Exp02 numeric 426800 0
## 22 22 CD8 Lymph nodes CAKR Exp03 numeric 426800 0
## 23 23 CD8 Lymph nodes WT Exp01 numeric 426800 0
## 24 24 CD8 Lymph nodes WT Exp02 numeric 426800 0
## 25 25 CD8 Lymph nodes WT Exp03 numeric 426800 0
## 26 26 CD8 Thymus CA Exp01 numeric 426800 0
## 27 27 CD8 Thymus CA Exp03 numeric 426800 0
## 28 28 CD8 Thymus CAKR Exp02 numeric 426800 0
## 29 29 CD8 Thymus CAKR Exp03 numeric 426800 0
## 30 30 CD8 Thymus WT Exp01 numeric 426800 0
## 31 31 CD8 Thymus WT Exp02 numeric 426800 0
## 32 32 CD8 Thymus WT Exp03 numeric 426800 0
## Per_of_Missing No_of_distinct_values
## 1 0.000 426800
## 2 0.000 23763
## 3 0.223 214
## 4 0.000 2168
## 5 0.000 29
## 6 0.000 21
## 7 0.000 33
## 8 0.000 20
## 9 0.000 38
## 10 0.000 30
## 11 0.000 25
## 12 0.000 24
## 13 0.000 42
## 14 0.000 19
## 15 0.000 14
## 16 0.000 39
## 17 0.000 17
## 18 0.000 15
## 19 0.000 24
## 20 0.000 25
## 21 0.000 24
## 22 0.000 32
## 23 0.000 22
## 24 0.000 32
## 25 0.000 23
## 26 0.000 9
## 27 0.000 20
## 28 0.000 14
## 29 0.000 14
## 30 0.000 16
## 31 0.000 9
## 32 0.000 20
Prop table TRA
excel_count_table_tra3 <- excel_count_table_tra2 %>% mutate(nkt_trav11_traj18 = if_else(
(grepl(allVHitsWithScore, pattern = "TRAV11") & (grepl(allJHitsWithScore, pattern = "TRAJ18"))),"yes","no"))
excel_count_table_tra4 <- excel_count_table_tra3 %>% filter(nkt_trav11_traj18 == "no")
count_table_tra4 <- as.matrix(excel_count_table_tra4[,5:32])
rownames(count_table_tra4) <- excel_count_table_tra4$aaSeqCDR3
tra4_norm <- scale(count_table_tra4, center=FALSE, scale=colSums(count_table_tra4))
prop.table.tra <- cbind(tra4_norm, excel_count_table_tra4 %>% select(-starts_with("CD")) )
prop.table.tra2 <- prop.table.tra %>% dplyr::select(29, 31, 30, 32, 1:28)
prop.table.tra2 %>% head## aaSeqCDR3 allJHitsWithScore allDHitsWithScore
## CAASASSGSWQLIF CAASASSGSWQLIF TRAJ22*00(303.6) <NA>
## CAASNMGYKLTF CAASNMGYKLTF TRAJ9*00(284.4) <NA>
## CAVSASSGSWQLIF CAVSASSGSWQLIF TRAJ22*00(304.2) <NA>
## CAASDNYAQGLTF CAASDNYAQGLTF TRAJ26*00(284) <NA>
## CALSDRYNQGKLIF CALSDRYNQGKLIF TRAJ23*00(272.3) <NA>
## CAASDDTNAYKVIF CAASDDTNAYKVIF TRAJ30*00(295.7) <NA>
## allVHitsWithScore
## CAASASSGSWQLIF TRAV14N-2*00(1026.1),TRAV14D-2*00(1017.8),TRAV14D-1*00(973.4),TRAV14N-1*00(973.4),TRAV14-2*00(944.4)
## CAASNMGYKLTF TRAV7-2*00(731.5),TRAV7D-2*00(731.5)
## CAVSASSGSWQLIF TRAV3D-3*00(975.2),TRAV3N-3*00(975.2),TRAV3-3*00(944.1)
## CAASDNYAQGLTF TRAV14D-1*00(878.2),TRAV14N-1*00(878.2),TRAV14D-2*00(848.2),TRAV14N-2*00(848.2)
## CALSDRYNQGKLIF TRAV12D-3*00(1377.3),TRAV12N-3*00(1377.3)
## CAASDDTNAYKVIF TRAV14-1*00(966.7)
## CD4 Lymph nodes CA Exp01 CD4 Lymph nodes CA Exp02
## CAASASSGSWQLIF 2.727925e-03 0.0035273369
## CAASNMGYKLTF 1.148600e-03 0.0016166961
## CAVSASSGSWQLIF 7.178751e-04 0.0007348618
## CAASDNYAQGLTF 2.871500e-04 0.0000000000
## CALSDRYNQGKLIF 7.178751e-05 0.0001469724
## CAASDDTNAYKVIF 0.000000e+00 0.0000000000
## CD4 Lymph nodes CA Exp03 CD4 Lymph nodes CAKR Exp02
## CAASASSGSWQLIF 3.841094e-03 0.0027958993
## CAASNMGYKLTF 1.104972e-03 0.0013979497
## CAVSASSGSWQLIF 3.157064e-04 0.0004659832
## CAASDNYAQGLTF 3.157064e-04 0.0007455732
## CALSDRYNQGKLIF 4.209419e-04 0.0000000000
## CAASDDTNAYKVIF 5.261773e-05 0.0003727866
## CD4 Lymph nodes CAKR Exp03 CD4 Lymph nodes WT Exp02
## CAASASSGSWQLIF 2.668321e-03 0.0035501049
## CAASNMGYKLTF 8.687558e-04 0.0008068420
## CAVSASSGSWQLIF 5.584859e-04 0.0002420526
## CAASDNYAQGLTF 1.861620e-04 0.0001613684
## CALSDRYNQGKLIF 6.205399e-05 0.0000806842
## CAASDDTNAYKVIF 1.241080e-04 0.0000000000
## CD4 Lymph nodes WT Exp03 CD4 Thymus CA Exp01 CD4 Thymus CA Exp03
## CAASASSGSWQLIF 3.291020e-03 0.0028328612 0.0022079929
## CAASNMGYKLTF 1.097007e-03 0.0012876642 0.0013247958
## CAVSASSGSWQLIF 2.611921e-04 0.0005150657 0.0003679988
## CAASDNYAQGLTF 1.567152e-04 0.0000000000 0.0003679988
## CALSDRYNQGKLIF 5.223842e-05 0.0000000000 0.0000000000
## CAASDDTNAYKVIF 1.567152e-04 0.0002575328 0.0000000000
## CD4 Thymus CAKR Exp01 CD4 Thymus CAKR Exp02
## CAASASSGSWQLIF 0.0013199578 0.0037400655
## CAASNMGYKLTF 0.0013199578 0.0000000000
## CAVSASSGSWQLIF 0.0000000000 0.0023375409
## CAASDNYAQGLTF 0.0005279831 0.0000000000
## CALSDRYNQGKLIF 0.0000000000 0.0000000000
## CAASDDTNAYKVIF 0.0000000000 0.0004675082
## CD4 Thymus CAKR Exp03 CD4 Thymus WT Exp02 CD4 Thymus WT Exp03
## CAASASSGSWQLIF 0.0033652602 0.0025633010 2.775089e-03
## CAASNMGYKLTF 0.0001294331 0.0013754298 5.808325e-04
## CAVSASSGSWQLIF 0.0005177323 0.0005001563 3.872217e-04
## CAASDNYAQGLTF 0.0000000000 0.0003125977 3.872217e-04
## CALSDRYNQGKLIF 0.0001294331 0.0000000000 6.453695e-05
## CAASDDTNAYKVIF 0.0000000000 0.0003751172 1.936108e-04
## CD8 Lymph nodes CA Exp02 CD8 Lymph nodes CA Exp03
## CAASASSGSWQLIF 0.0099369387 0.0093881515
## CAASNMGYKLTF 0.0018154023 0.0020233085
## CAVSASSGSWQLIF 0.0021975922 0.0018614438
## CAASDNYAQGLTF 0.0009554749 0.0009711881
## CALSDRYNQGKLIF 0.0011465698 0.0008902557
## CAASDDTNAYKVIF 0.0012421173 0.0008093234
## CD8 Lymph nodes CAKR Exp02 CD8 Lymph nodes CAKR Exp03
## CAASASSGSWQLIF 0.0080917874 0.0096842588
## CAASNMGYKLTF 0.0019323671 0.0022348289
## CAVSASSGSWQLIF 0.0008454106 0.0020056157
## CAASDNYAQGLTF 0.0007246377 0.0009741562
## CALSDRYNQGKLIF 0.0016908213 0.0015471893
## CAASDDTNAYKVIF 0.0006038647 0.0004011231
## CD8 Lymph nodes WT Exp01 CD8 Lymph nodes WT Exp02
## CAASASSGSWQLIF 0.009602698 0.011932808
## CAASNMGYKLTF 0.002492303 0.001946123
## CAVSASSGSWQLIF 0.001905879 0.001894909
## CAASDNYAQGLTF 0.001612667 0.001177917
## CALSDRYNQGKLIF 0.003665152 0.003021612
## CAASDDTNAYKVIF 0.001319455 0.000870634
## CD8 Lymph nodes WT Exp03 CD8 Thymus CA Exp01 CD8 Thymus CA Exp03
## CAASASSGSWQLIF 0.0097454014 0.0062111801 0.0073130455
## CAASNMGYKLTF 0.0021318066 0.0014614541 0.0025163167
## CAVSASSGSWQLIF 0.0019490803 0.0014614541 0.0008649839
## CAASDNYAQGLTF 0.0012790839 0.0010960906 0.0004718094
## CALSDRYNQGKLIF 0.0024363503 0.0003653635 0.0005504443
## CAASDDTNAYKVIF 0.0007918139 0.0007307271 0.0009436188
## CD8 Thymus CAKR Exp02 CD8 Thymus CAKR Exp03 CD8 Thymus WT Exp01
## CAASASSGSWQLIF 0.0084643289 0.0065963061 0.0088170463
## CAASNMGYKLTF 0.0008061266 0.0015077271 0.0020992967
## CAVSASSGSWQLIF 0.0016122531 0.0009423294 0.0019943319
## CAASDNYAQGLTF 0.0012091898 0.0011307953 0.0015744726
## CALSDRYNQGKLIF 0.0004030633 0.0005653977 0.0016794374
## CAASDDTNAYKVIF 0.0016122531 0.0016961930 0.0007347539
## CD8 Thymus WT Exp02 CD8 Thymus WT Exp03
## CAASASSGSWQLIF 0.0053846154 0.007650815
## CAASNMGYKLTF 0.0015384615 0.001761339
## CAVSASSGSWQLIF 0.0015384615 0.001541171
## CAASDNYAQGLTF 0.0023076923 0.001431088
## CALSDRYNQGKLIF 0.0019230769 0.001155878
## CAASDDTNAYKVIF 0.0003846154 0.001265962
ExpData(data=prop.table.tra2, type=2)## Index Variable_Name Variable_Type Sample_n Missing_Count
## 1 1 aaSeqCDR3 character 114630 0
## 2 2 allJHitsWithScore character 114630 0
## 3 3 allDHitsWithScore character 4769 109861
## 4 4 allVHitsWithScore character 114630 0
## 5 5 CD4 Lymph nodes CA Exp01 numeric 114630 0
## 6 6 CD4 Lymph nodes CA Exp02 numeric 114630 0
## 7 7 CD4 Lymph nodes CA Exp03 numeric 114630 0
## 8 8 CD4 Lymph nodes CAKR Exp02 numeric 114630 0
## 9 9 CD4 Lymph nodes CAKR Exp03 numeric 114630 0
## 10 10 CD4 Lymph nodes WT Exp02 numeric 114630 0
## 11 11 CD4 Lymph nodes WT Exp03 numeric 114630 0
## 12 12 CD4 Thymus CA Exp01 numeric 114630 0
## 13 13 CD4 Thymus CA Exp03 numeric 114630 0
## 14 14 CD4 Thymus CAKR Exp01 numeric 114630 0
## 15 15 CD4 Thymus CAKR Exp02 numeric 114630 0
## 16 16 CD4 Thymus CAKR Exp03 numeric 114630 0
## 17 17 CD4 Thymus WT Exp02 numeric 114630 0
## 18 18 CD4 Thymus WT Exp03 numeric 114630 0
## 19 19 CD8 Lymph nodes CA Exp02 numeric 114630 0
## 20 20 CD8 Lymph nodes CA Exp03 numeric 114630 0
## 21 21 CD8 Lymph nodes CAKR Exp02 numeric 114630 0
## 22 22 CD8 Lymph nodes CAKR Exp03 numeric 114630 0
## 23 23 CD8 Lymph nodes WT Exp01 numeric 114630 0
## 24 24 CD8 Lymph nodes WT Exp02 numeric 114630 0
## 25 25 CD8 Lymph nodes WT Exp03 numeric 114630 0
## 26 26 CD8 Thymus CA Exp01 numeric 114630 0
## 27 27 CD8 Thymus CA Exp03 numeric 114630 0
## 28 28 CD8 Thymus CAKR Exp02 numeric 114630 0
## 29 29 CD8 Thymus CAKR Exp03 numeric 114630 0
## 30 30 CD8 Thymus WT Exp01 numeric 114630 0
## 31 31 CD8 Thymus WT Exp02 numeric 114630 0
## 32 32 CD8 Thymus WT Exp03 numeric 114630 0
## Per_of_Missing No_of_distinct_values
## 1 0.000 114630
## 2 0.000 2972
## 3 0.958 110
## 4 0.000 31170
## 5 0.000 27
## 6 0.000 21
## 7 0.000 36
## 8 0.000 25
## 9 0.000 34
## 10 0.000 28
## 11 0.000 34
## 12 0.000 12
## 13 0.000 29
## 14 0.000 11
## 15 0.000 10
## 16 0.000 18
## 17 0.000 28
## 18 0.000 25
## 19 0.000 21
## 20 0.000 23
## 21 0.000 17
## 22 0.000 25
## 23 0.000 26
## 24 0.000 31
## 25 0.000 30
## 26 0.000 9
## 27 0.000 23
## 28 0.000 9
## 29 0.000 11
## 30 0.000 22
## 31 0.000 9
## 32 0.000 30
Prop table TRB
excel_count_table_trb3 <- excel_count_table_trb2 %>%
mutate(nkt_trav11_traj18 = "no")
count_table_trb4 <- as.matrix(excel_count_table_trb3[,5:32])
rownames(count_table_trb4) <- excel_count_table_trb3$aaSeqCDR3
trb4_norm <- scale(count_table_trb4, center=FALSE, scale=colSums(count_table_trb4))
prop.table.trb <- cbind(trb4_norm, excel_count_table_trb %>% select(-starts_with("CD")) )
colnames(prop.table.trb)## [1] "CD4 Lymph nodes CA Exp01" "CD4 Lymph nodes CA Exp02"
## [3] "CD4 Lymph nodes CA Exp03" "CD4 Lymph nodes CAKR Exp02"
## [5] "CD4 Lymph nodes CAKR Exp03" "CD4 Lymph nodes WT Exp02"
## [7] "CD4 Lymph nodes WT Exp03" "CD4 Thymus CA Exp01"
## [9] "CD4 Thymus CA Exp03" "CD4 Thymus CAKR Exp01"
## [11] "CD4 Thymus CAKR Exp02" "CD4 Thymus CAKR Exp03"
## [13] "CD4 Thymus WT Exp02" "CD4 Thymus WT Exp03"
## [15] "CD8 Lymph nodes CA Exp02" "CD8 Lymph nodes CA Exp03"
## [17] "CD8 Lymph nodes CAKR Exp02" "CD8 Lymph nodes CAKR Exp03"
## [19] "CD8 Lymph nodes WT Exp01" "CD8 Lymph nodes WT Exp02"
## [21] "CD8 Lymph nodes WT Exp03" "CD8 Thymus CA Exp01"
## [23] "CD8 Thymus CA Exp03" "CD8 Thymus CAKR Exp02"
## [25] "CD8 Thymus CAKR Exp03" "CD8 Thymus WT Exp01"
## [27] "CD8 Thymus WT Exp02" "CD8 Thymus WT Exp03"
## [29] "aaSeqCDR3" "allVHitsWithScore"
## [31] "allDHitsWithScore" "allJHitsWithScore"
prop.table.trb2 <- prop.table.trb %>% dplyr::select(29, 31, 30, 32, 1:28)
prop.table.trb2 %>% head## aaSeqCDR3 allDHitsWithScore
## CASSDSAETLYF CASSDSAETLYF <NA>
## CASSDAEQFF CASSDAEQFF <NA>
## CASSDAGYEQYF CASSDAGYEQYF <NA>
## CASSDWGGYAEQFF CASSDWGGYAEQFF TRBD2*00(45)
## CASSDSGGQDTQYF CASSDSGGQDTQYF TRBD2*00(35)
## CASRDRNTEVFF CASRDRNTEVFF TRBD1*00(30)
## allVHitsWithScore allJHitsWithScore
## CASSDSAETLYF TRBV13-3*00(755.1),TRBV13-1*00(664.5) TRBJ2-3*00(245)
## CASSDAEQFF TRBV13-1*00(819.6) TRBJ2-1*00(219.2)
## CASSDAGYEQYF TRBV13-1*00(1283.8) TRBJ2-7*00(215)
## CASSDWGGYAEQFF TRBV13-1*00(676) TRBJ2-1*00(235)
## CASSDSGGQDTQYF TRBV13-3*00(886.3) TRBJ2-5*00(235)
## CASRDRNTEVFF TRBV13-3*00(979) TRBJ1-1*00(232.7)
## CD4 Lymph nodes CA Exp01 CD4 Lymph nodes CA Exp02
## CASSDSAETLYF 1.729847e-04 5.154108e-05
## CASSDAEQFF 0.000000e+00 0.000000e+00
## CASSDAGYEQYF 0.000000e+00 5.154108e-05
## CASSDWGGYAEQFF 7.413631e-05 5.154108e-05
## CASSDSGGQDTQYF 2.471210e-05 0.000000e+00
## CASRDRNTEVFF 1.482726e-04 5.154108e-05
## CD4 Lymph nodes CA Exp03 CD4 Lymph nodes CAKR Exp02
## CASSDSAETLYF 1.141010e-04 3.152287e-05
## CASSDAEQFF 9.508415e-05 0.000000e+00
## CASSDAGYEQYF 3.803366e-05 0.000000e+00
## CASSDWGGYAEQFF 1.901683e-05 1.260915e-04
## CASSDSGGQDTQYF 3.803366e-05 3.152287e-05
## CASRDRNTEVFF 5.705049e-05 9.456861e-05
## CD4 Lymph nodes CAKR Exp03 CD4 Lymph nodes WT Exp02
## CASSDSAETLYF 2.747190e-04 2.637618e-05
## CASSDAEQFF 2.289325e-05 5.275235e-05
## CASSDAGYEQYF 6.867975e-05 2.637618e-04
## CASSDWGGYAEQFF 0.000000e+00 0.000000e+00
## CASSDSGGQDTQYF 0.000000e+00 0.000000e+00
## CASRDRNTEVFF 0.000000e+00 0.000000e+00
## CD4 Lymph nodes WT Exp03 CD4 Thymus CA Exp01 CD4 Thymus CA Exp03
## CASSDSAETLYF 2.293368e-04 2.366397e-04 6.682259e-05
## CASSDAEQFF 2.293368e-05 7.887991e-05 4.454839e-05
## CASSDAGYEQYF 2.293368e-05 3.549596e-04 1.113710e-04
## CASSDWGGYAEQFF 0.000000e+00 1.577598e-04 2.227420e-05
## CASSDSGGQDTQYF 0.000000e+00 0.000000e+00 0.000000e+00
## CASRDRNTEVFF 6.880103e-05 0.000000e+00 4.454839e-05
## CD4 Thymus CAKR Exp01 CD4 Thymus CAKR Exp02
## CASSDSAETLYF 8.731718e-05 0
## CASSDAEQFF 0.000000e+00 0
## CASSDAGYEQYF 4.365859e-05 0
## CASSDWGGYAEQFF 0.000000e+00 0
## CASSDSGGQDTQYF 0.000000e+00 0
## CASRDRNTEVFF 0.000000e+00 0
## CD4 Thymus CAKR Exp03 CD4 Thymus WT Exp02 CD4 Thymus WT Exp03
## CASSDSAETLYF 0.0000000000 4.972527e-05 2.538554e-04
## CASSDAEQFF 0.0000000000 2.486263e-05 3.173193e-05
## CASSDAGYEQYF 0.0000300075 2.486263e-05 0.000000e+00
## CASSDWGGYAEQFF 0.0000000000 2.486263e-05 3.173193e-05
## CASSDSGGQDTQYF 0.0000000000 2.486263e-05 0.000000e+00
## CASRDRNTEVFF 0.0002100525 0.000000e+00 3.173193e-05
## CD8 Lymph nodes CA Exp02 CD8 Lymph nodes CA Exp03
## CASSDSAETLYF 3.846495e-04 3.699984e-04
## CASSDAEQFF 2.662958e-04 1.849992e-04
## CASSDAGYEQYF 3.254727e-04 2.642846e-04
## CASSDWGGYAEQFF 1.183537e-04 2.642846e-04
## CASSDSGGQDTQYF 5.917685e-05 5.285692e-05
## CASRDRNTEVFF 2.367074e-04 5.285692e-05
## CD8 Lymph nodes CAKR Exp02 CD8 Lymph nodes CAKR Exp03
## CASSDSAETLYF 3.195909e-04 2.544113e-04
## CASSDAEQFF 1.452686e-04 7.268895e-05
## CASSDAGYEQYF 3.776984e-04 2.544113e-04
## CASSDWGGYAEQFF 9.878265e-04 4.179614e-04
## CASSDSGGQDTQYF 2.905372e-05 7.268895e-05
## CASRDRNTEVFF 1.452686e-04 1.272057e-04
## CD8 Lymph nodes WT Exp01 CD8 Lymph nodes WT Exp02
## CASSDSAETLYF 0.0005254621 3.488707e-04
## CASSDAEQFF 0.0002741541 1.365146e-04
## CASSDAGYEQYF 0.0005483082 6.219000e-04
## CASSDWGGYAEQFF 0.0003198465 2.730293e-04
## CASSDSGGQDTQYF 0.0003198465 4.550488e-05
## CASRDRNTEVFF 0.0002056156 4.550488e-05
## CD8 Lymph nodes WT Exp03 CD8 Thymus CA Exp01 CD8 Thymus CA Exp03
## CASSDSAETLYF 0.0002286917 0.0003316475 1.856493e-04
## CASSDAEQFF 0.0001829533 0.0002487356 9.282465e-05
## CASSDAGYEQYF 0.0005031216 0.0003316475 1.856493e-04
## CASSDWGGYAEQFF 0.0002058225 0.0001658237 3.094155e-05
## CASSDSGGQDTQYF 0.0000000000 0.0000000000 3.094155e-05
## CASRDRNTEVFF 0.0001372150 0.0000000000 1.237662e-04
## CD8 Thymus CAKR Exp02 CD8 Thymus CAKR Exp03 CD8 Thymus WT Exp01
## CASSDSAETLYF 0.0000000000 0.0000000000 1.804457e-04
## CASSDAEQFF 0.0002597628 0.0002214839 3.308171e-04
## CASSDAGYEQYF 0.0000000000 0.0001661130 4.210400e-04
## CASSDWGGYAEQFF 0.0003463503 0.0002214839 2.706686e-04
## CASSDSGGQDTQYF 0.0000000000 0.0001107420 6.014857e-05
## CASRDRNTEVFF 0.0000000000 0.0000000000 1.202971e-04
## CD8 Thymus WT Exp02 CD8 Thymus WT Exp03
## CASSDSAETLYF 4.479484e-04 2.347268e-04
## CASSDAEQFF 2.687690e-04 8.535518e-05
## CASSDAGYEQYF 3.583587e-04 2.774043e-04
## CASSDWGGYAEQFF 2.687690e-04 6.401639e-05
## CASSDSGGQDTQYF 0.000000e+00 2.133880e-05
## CASRDRNTEVFF 8.958968e-05 1.066940e-04
ExpData(data=prop.table.trb2, type=2)## Index Variable_Name Variable_Type Sample_n Missing_Count
## 1 1 aaSeqCDR3 character 426800 0
## 2 2 allDHitsWithScore character 331672 95128
## 3 3 allVHitsWithScore character 426800 0
## 4 4 allJHitsWithScore character 426800 0
## 5 5 CD4 Lymph nodes CA Exp01 numeric 426800 0
## 6 6 CD4 Lymph nodes CA Exp02 numeric 426800 0
## 7 7 CD4 Lymph nodes CA Exp03 numeric 426800 0
## 8 8 CD4 Lymph nodes CAKR Exp02 numeric 426800 0
## 9 9 CD4 Lymph nodes CAKR Exp03 numeric 426800 0
## 10 10 CD4 Lymph nodes WT Exp02 numeric 426800 0
## 11 11 CD4 Lymph nodes WT Exp03 numeric 426800 0
## 12 12 CD4 Thymus CA Exp01 numeric 426800 0
## 13 13 CD4 Thymus CA Exp03 numeric 426800 0
## 14 14 CD4 Thymus CAKR Exp01 numeric 426800 0
## 15 15 CD4 Thymus CAKR Exp02 numeric 426800 0
## 16 16 CD4 Thymus CAKR Exp03 numeric 426800 0
## 17 17 CD4 Thymus WT Exp02 numeric 426800 0
## 18 18 CD4 Thymus WT Exp03 numeric 426800 0
## 19 19 CD8 Lymph nodes CA Exp02 numeric 426800 0
## 20 20 CD8 Lymph nodes CA Exp03 numeric 426800 0
## 21 21 CD8 Lymph nodes CAKR Exp02 numeric 426800 0
## 22 22 CD8 Lymph nodes CAKR Exp03 numeric 426800 0
## 23 23 CD8 Lymph nodes WT Exp01 numeric 426800 0
## 24 24 CD8 Lymph nodes WT Exp02 numeric 426800 0
## 25 25 CD8 Lymph nodes WT Exp03 numeric 426800 0
## 26 26 CD8 Thymus CA Exp01 numeric 426800 0
## 27 27 CD8 Thymus CA Exp03 numeric 426800 0
## 28 28 CD8 Thymus CAKR Exp02 numeric 426800 0
## 29 29 CD8 Thymus CAKR Exp03 numeric 426800 0
## 30 30 CD8 Thymus WT Exp01 numeric 426800 0
## 31 31 CD8 Thymus WT Exp02 numeric 426800 0
## 32 32 CD8 Thymus WT Exp03 numeric 426800 0
## Per_of_Missing No_of_distinct_values
## 1 0.000 426800
## 2 0.223 214
## 3 0.000 23763
## 4 0.000 2168
## 5 0.000 29
## 6 0.000 21
## 7 0.000 33
## 8 0.000 20
## 9 0.000 38
## 10 0.000 30
## 11 0.000 25
## 12 0.000 24
## 13 0.000 42
## 14 0.000 19
## 15 0.000 14
## 16 0.000 39
## 17 0.000 17
## 18 0.000 15
## 19 0.000 24
## 20 0.000 25
## 21 0.000 24
## 22 0.000 32
## 23 0.000 22
## 24 0.000 32
## 25 0.000 23
## 26 0.000 9
## 27 0.000 20
## 28 0.000 14
## 29 0.000 14
## 30 0.000 16
## 31 0.000 9
## 32 0.000 20
Remove CDR3s with occurrence in less than 5 samples
# TRA
keep_tra_cd4 <- excel_count_table_tra %>%
mutate(nkt_trav11_traj18 = if_else(
(grepl(allVHitsWithScore, pattern = "TRAV11") |
(grepl(allJHitsWithScore, pattern = "TRAJ18"))),"yes","no")) %>%
filter(nkt_trav11_traj18 == "no") %>%
select(aaSeqCDR3, starts_with("CD4")) %>%
mutate_at(vars(starts_with("CD4")), .funs = binary)
keep_tra_cd4$sum <- rowSums((keep_tra_cd4 %>% select(-aaSeqCDR3)))
keep_tra_cd4$keep <- ifelse(keep_tra_cd4$sum>4,1,0)
keep_tra_cd4_sequences <- pull(keep_tra_cd4 %>% filter(keep == 1), aaSeqCDR3)
# TRB
keep_trb_cd4 <- excel_count_table_trb %>% select(aaSeqCDR3, starts_with("CD4")) %>% mutate_at(vars(starts_with("CD4")), .funs = binary)
keep_trb_cd4$sum <- rowSums((keep_trb_cd4 %>% select(-aaSeqCDR3)))
keep_trb_cd4$keep <- ifelse(keep_trb_cd4$sum>4,1,0)
keep_trb_cd4_sequences <- pull(keep_trb_cd4 %>% filter(keep == 1), aaSeqCDR3)Remove CDR3s with occurrence in less than 5 samples
# TRA
keep_tra_cd8 <- excel_count_table_tra %>%
mutate(nkt_trav11_traj18 = if_else(
(grepl(allVHitsWithScore, pattern = "TRAV11") |
(grepl(allJHitsWithScore, pattern = "TRAJ18"))),"yes","no")) %>%
filter(nkt_trav11_traj18 == "no") %>%
select(aaSeqCDR3, starts_with("CD8")) %>%
mutate_at(vars(starts_with("CD8")), .funs = binary)
keep_tra_cd8$sum <- rowSums((keep_tra_cd8 %>% select(-aaSeqCDR3)))
keep_tra_cd8$keep <- ifelse(keep_tra_cd8$sum>4,1,0)
keep_tra_cd8_sequences <- pull(keep_tra_cd8 %>% filter(keep == 1), aaSeqCDR3)
# TRB
keep_trb_cd8 <- excel_count_table_trb %>% select(aaSeqCDR3, starts_with("CD8")) %>% mutate_at(vars(starts_with("CD8")), .funs = binary)
keep_trb_cd8$sum <- rowSums((keep_trb_cd8 %>% select(-aaSeqCDR3)))
keep_trb_cd8$keep <- ifelse(keep_trb_cd8$sum>4,1,0)
keep_trb_cd8_sequences <- pull(keep_trb_cd8 %>% filter(keep == 1), aaSeqCDR3)excel_count_table_trb3_filter <- excel_count_table_trb2 %>% filter(aaSeqCDR3 %in% keep_trb_cd4_sequences | aaSeqCDR3 %in% keep_trb_cd8_sequences)
count_table_trb4_filter <- as.matrix(excel_count_table_trb3_filter[,5:32])
rownames(count_table_trb4_filter) <- excel_count_table_trb3_filter$aaSeqCDR3
trb4_norm_filter <- scale(count_table_trb4_filter, center=FALSE, scale=colSums(count_table_trb4_filter))
prop.table.trb_filter <- cbind(trb4_norm_filter, excel_count_table_trb3_filter %>% select(-starts_with("CD")) )excel_count_table_tra3_filter <- excel_count_table_tra3 %>% filter(aaSeqCDR3 %in% keep_tra_cd4_sequences | aaSeqCDR3 %in% keep_tra_cd8_sequences) %>% filter(nkt_trav11_traj18 == "no")
count_table_tra4_filter <- as.matrix(excel_count_table_tra3_filter[,5:32])
rownames(count_table_tra4_filter) <- excel_count_table_tra3_filter$aaSeqCDR3
tra4_norm_filter <- scale(count_table_tra4_filter, center=FALSE, scale=colSums(count_table_tra4_filter))
prop.table.tra_filter <- cbind(tra4_norm_filter, excel_count_table_tra3_filter %>% select(-starts_with("CD")) )
colnames(prop.table.tra_filter)## [1] "CD4 Lymph nodes CA Exp01" "CD4 Lymph nodes CA Exp02"
## [3] "CD4 Lymph nodes CA Exp03" "CD4 Lymph nodes CAKR Exp02"
## [5] "CD4 Lymph nodes CAKR Exp03" "CD4 Lymph nodes WT Exp02"
## [7] "CD4 Lymph nodes WT Exp03" "CD4 Thymus CA Exp01"
## [9] "CD4 Thymus CA Exp03" "CD4 Thymus CAKR Exp01"
## [11] "CD4 Thymus CAKR Exp02" "CD4 Thymus CAKR Exp03"
## [13] "CD4 Thymus WT Exp02" "CD4 Thymus WT Exp03"
## [15] "CD8 Lymph nodes CA Exp02" "CD8 Lymph nodes CA Exp03"
## [17] "CD8 Lymph nodes CAKR Exp02" "CD8 Lymph nodes CAKR Exp03"
## [19] "CD8 Lymph nodes WT Exp01" "CD8 Lymph nodes WT Exp02"
## [21] "CD8 Lymph nodes WT Exp03" "CD8 Thymus CA Exp01"
## [23] "CD8 Thymus CA Exp03" "CD8 Thymus CAKR Exp02"
## [25] "CD8 Thymus CAKR Exp03" "CD8 Thymus WT Exp01"
## [27] "CD8 Thymus WT Exp02" "CD8 Thymus WT Exp03"
## [29] "aaSeqCDR3" "allDHitsWithScore"
## [31] "allJHitsWithScore" "allVHitsWithScore"
## [33] "nkt_trav11_traj18"
immdata_tra <- readRDS("immdata_tra.rds")
immdata_trb <- readRDS("immdata_trb.rds")### Thymus
cd4_thymus <- geneUsage(repFilter(immdata_tra,
.method = "by.meta",
.query = list(Organ = include('Thymus'), Cell_type = include('CD4')))$data, .norm = T, .ambig = "maj", .quant = "count", .gene = "musmus.trav")
vis(cd4_thymus, .by = c("Cell_type", "Organ","Mouse_strain"), .meta = immdata_tra$meta) + ggtitle("TRA Thymus CD4") + scale_fill_manual(values = c("dodgerblue","indianred2","gray70"))#ggsave("./final_fig/gene_usage_new/cd4_thymus_tra_just1segment.png", width = 60, height = 12, units = "cm")
#ggsave("./final_fig/gene_usage_new/cd4_thymus_tra_just1segment.svg", width = 60, height = 12, units = "cm")
### LN
cd4_ln <- geneUsage(repFilter(immdata_tra,
.method = "by.meta",
.query = list(Organ = exclude('Thymus'), Cell_type = include('CD4')))$data, .norm = T, .ambig = "maj", .quant = "count", .gene = "musmus.trav")
vis(cd4_ln, .by = c("Cell_type", "Organ","Mouse_strain"), .meta = immdata_tra$meta) + ggtitle("TRA LN CD4") + scale_fill_manual(values = c("dodgerblue","indianred2","gray70"))#ggsave("./final_fig/gene_usage_new/cd4_ln_tra_just1segment.png", width = 60, height = 12, units = "cm")
#ggsave("./final_fig/gene_usage_new/cd4_ln_tra_just1segment.svg", width = 60, height = 12, units = "cm")
cd8_thymus <- geneUsage(repFilter(immdata_tra,
.method = "by.meta",
.query = list(Organ = include('Thymus'), Cell_type = include('CD8')))$data, .norm = T, .ambig = "maj", .quant = "count", .gene = "musmus.trav")
vis(cd8_thymus, .by = c("Cell_type", "Organ","Mouse_strain"), .meta = immdata_tra$meta) + ggtitle("TRA Thymus CD8") + scale_fill_manual(values = c("dodgerblue","indianred2","gray70"))#ggsave("./final_fig/gene_usage_new/cd8_thymus_tra_just1segment.png", width = 60, height = 12, units = "cm")
#ggsave("./final_fig/gene_usage_new/cd8_thymus_tra_just1segment.svg", width = 60, height = 12, units = "cm")
cd8_ln <- geneUsage(repFilter(immdata_tra,
.method = "by.meta",
.query = list(Organ = exclude('Thymus'), Cell_type = include('CD8')))$data, .norm = T, .ambig = "maj", .quant = "count", .gene = "musmus.trav")
vis(cd8_ln, .by = c("Cell_type", "Organ","Mouse_strain"), .meta = immdata_tra$meta) + ggtitle("TRA LN CD8") + scale_fill_manual(values = c("dodgerblue","indianred2","gray70"))#ggsave("./final_fig/gene_usage_new/cd8_ln_tra_just1segment.png", width = 60, height = 12, units = "cm")
#ggsave("./final_fig/gene_usage_new/cd8_ln_tra_just1segment.svg", width = 60, height = 12, units = "cm")### Thymus
cd4_thymus <- geneUsage(repFilter(immdata_trb,
.method = "by.meta",
.query = list(Organ = include('Thymus'), Cell_type = include('CD4')))$data, .norm = T, .ambig = "maj", .quant = "count", .gene = "musmus.trbv")
vis(cd4_thymus, .by = c("Cell_type", "Organ","Mouse_strain"), .meta = immdata_trb$meta, .test = F) + ggtitle("TRB Thymus CD4") + scale_fill_manual(values = c("dodgerblue","indianred2","gray70")) #ggsave("./final_fig/gene_usage_new/cd4_thymus_trb_just1segment.png", width = 20, height = 10, units = "cm")
#ggsave("./final_fig/gene_usage_new/cd4_thymus_trb_just1segment.svg", width = 20, height = 10, units = "cm")
### LN
cd4_ln <- geneUsage(repFilter(immdata_trb,
.method = "by.meta",
.query = list(Organ = exclude('Thymus'), Cell_type = include('CD4')))$data, .norm = T, .ambig = "maj", .quant = "count", .gene = "musmus.trbv")
vis(cd4_ln, .by = c("Cell_type", "Organ","Mouse_strain"), .meta = immdata_trb$meta, .test = F) + ggtitle("TRB LN CD4") + scale_fill_manual(values = c("dodgerblue","indianred2","gray70"))#ggsave("./final_fig/gene_usage_new/cd4_ln_trb_just1segment.png", width = 20, height = 10, units = "cm")
#ggsave("./final_fig/gene_usage_new/cd4_ln_trb_just1segment.svg", width = 20, height = 10, units = "cm")
cd8_thymus <- geneUsage(repFilter(immdata_trb,
.method = "by.meta",
.query = list(Organ = include('Thymus'), Cell_type = include('CD8')))$data, .norm = T, .ambig = "maj", .quant = "count", .gene = "musmus.trbv")
vis(cd8_thymus, .by = c("Cell_type", "Organ","Mouse_strain"), .meta = immdata_trb$meta, .test = F) + ggtitle("TRB Thymus CD8") + scale_fill_manual(values = c("dodgerblue","indianred2","gray70"))#ggsave("./final_fig/gene_usage_new/cd8_thymus_trb_just1segment.png", width = 20, height = 10, units = "cm")
#ggsave("./final_fig/gene_usage_new/cd8_thymus_trb_just1segment.svg", width = 20, height = 10, units = "cm")
cd8_ln <- geneUsage(repFilter(immdata_trb,
.method = "by.meta",
.query = list(Organ = exclude('Thymus'), Cell_type = include('CD8')))$data, .norm = T, .ambig = "maj", .quant = "count", .gene = "musmus.trbv")
vis(cd8_ln, .by = c("Cell_type", "Organ","Mouse_strain"), .meta = immdata_trb$meta, .test = F) + ggtitle("TRB LN CD8") + scale_fill_manual(values = c("dodgerblue","indianred2","gray70"))#ggsave("./final_fig/gene_usage_new/cd8_ln_trb_just1segment.png", width = 20, height = 10, units = "cm")
#ggsave("./final_fig/gene_usage_new/cd8_ln_trb_just1segment.svg", width = 20, height = 10, units = "cm")Next, we will identify gene segments that are typical for invariant NKT cells. These segments are TRAV11 and TRAJ18 for TRA and TRBV1, TRBV13-2 and TRBV29 for TRB. Please, note that some of the included TRB chains are used by conventional cells as well.
Counts of NKT gene segments:
# NKT analysis
merged_repertoires <- merged_repertoires %>%
mutate(is_nkt = if_else(
(grepl(allVHitsWithScore, pattern = "TRAV11")) & (grepl(allJHitsWithScore, pattern = "TRAJ18")) |
(grepl(allVHitsWithScore, pattern = "TRBV13-2")) |
(grepl(allVHitsWithScore, pattern = "TRBV1\\*")) |
(grepl(allVHitsWithScore, pattern = "TRBV29")) ,"yes","no")) %>%
mutate(nkt_trav11_traj18 = if_else(
(grepl(allVHitsWithScore, pattern = "TRAV11") & (grepl(allJHitsWithScore, pattern = "TRAJ18"))),"yes","no")) %>% mutate(nkt_trbv13_2 = if_else(
(grepl(allVHitsWithScore, pattern = "TRBV13-2")),"yes","no")) %>% mutate(nkt_trbv29 = if_else(
(grepl(allVHitsWithScore, pattern = "TRBV29")),"yes","no")) %>% mutate(nkt_trbv1 = if_else(
(grepl(allVHitsWithScore, pattern = "TRBV1\\*")),"yes","no"))
md <- md %>% select(Exp, Organ, Cell_type, Mouse_strain, num_id) %>% mutate_at(vars("num_id"), as.numeric)
# All TRB NKT
nkt_table_trb <- merged_repertoires %>%
filter(chain == "TRB") %>%
mutate(new_name = paste(Exp, Organ, Cell_type, Mouse_strain)) %>%
dplyr::select(new_name, cloneCount, is_nkt, num_id) %>%
uncount(cloneCount) %>%
group_by(num_id, is_nkt) %>%
summarise(n = n()) %>%
mutate(freq = n / sum(n)) %>% dplyr::filter(is_nkt == "yes") %>%
ungroup %>%
left_join(md) %>%
unique %>%
mutate(sample_type = paste(Cell_type, Organ, Mouse_strain))
# TRAV11 TRAJ18
nkt_table_trav11_traj18 <- merged_repertoires %>%
filter(chain == "TRA") %>%
mutate(new_name = paste(Exp, Organ, Cell_type, Mouse_strain)) %>%
dplyr::select(new_name, cloneCount, nkt_trav11_traj18, num_id) %>%
uncount(cloneCount) %>%
group_by(num_id, nkt_trav11_traj18) %>%
summarise(n = n()) %>%
mutate(freq = n / sum(n)) %>% dplyr::filter(nkt_trav11_traj18 == "yes") %>%
ungroup %>%
left_join(md) %>%
unique %>%
mutate(sample_type = paste(Cell_type, Organ, Mouse_strain))
# TRBV1
nkt_table_trbv1 <- merged_repertoires %>%
filter(chain == "TRB") %>%
mutate(new_name = paste(Exp, Organ, Cell_type, Mouse_strain)) %>%
dplyr::select(new_name, cloneCount, nkt_trbv1, num_id) %>%
uncount(cloneCount) %>%
group_by(num_id, nkt_trbv1) %>%
summarise(n = n()) %>%
mutate(freq = n / sum(n)) %>% dplyr::filter(nkt_trbv1 == "yes") %>%
ungroup %>%
left_join(md) %>%
unique %>%
mutate(sample_type = paste(Cell_type, Organ, Mouse_strain))
# TRBV13-2
nkt_table_trbv13_2 <- merged_repertoires %>%
filter(chain == "TRB") %>%
mutate(new_name = paste(Exp, Organ, Cell_type, Mouse_strain)) %>%
dplyr::select(new_name, cloneCount, nkt_trbv13_2, num_id) %>%
uncount(cloneCount) %>%
group_by(num_id, nkt_trbv13_2) %>%
summarise(n = n()) %>%
mutate(freq = n / sum(n)) %>% dplyr::filter(nkt_trbv13_2 == "yes") %>%
ungroup %>%
left_join(md) %>%
unique %>%
mutate(sample_type = paste(Cell_type, Organ, Mouse_strain))
# TRBV29
nkt_table_trbv29 <- merged_repertoires %>%
filter(chain == "TRB") %>%
mutate(new_name = paste(Exp, Organ, Cell_type, Mouse_strain)) %>%
dplyr::select(new_name, cloneCount, nkt_trbv29, num_id) %>%
uncount(cloneCount) %>%
group_by(num_id, nkt_trbv29) %>%
summarise(n = n()) %>%
mutate(freq = n / sum(n)) %>% dplyr::filter(nkt_trbv29 == "yes") %>%
ungroup %>%
left_join(md) %>%
unique %>%
mutate(sample_type = paste(Cell_type, Organ, Mouse_strain))
levels_cd4 <- c("CD4 Thymus WT", "CD4 Thymus CA", "CD4 Thymus CAKR", "CD4 Lymph nodes WT", "CD4 Lymph nodes CA", "CD4 Lymph nodes CAKR")
levels_cd8 <- c("CD8 Thymus WT", "CD8 Thymus CA", "CD8 Thymus CAKR", "CD8 Lymph nodes WT", "CD8 Lymph nodes CA", "CD8 Lymph nodes CAKR")nkt_table_trb$nkt_sample <- "all_nkt_trb"
nkt_table_trav11_traj18$nkt_sample <- "trav11_traj18"
nkt_table_trbv1$nkt_sample <- "trbv1"
nkt_table_trbv13_2$nkt_sample <- "trbv13_2"
nkt_table_trbv29$nkt_sample <- "trbv29"
nkt_table <- cbind(nkt_table_trb$num_id, nkt_table_trb$sample_type, nkt_table_trb$Exp, format(nkt_table_trb$freq, digits = 2),
format(nkt_table_trav11_traj18$freq, digits = 2),
format(nkt_table_trbv1$freq, digits = 2),
format(nkt_table_trbv13_2$freq, digits = 2),
format(nkt_table_trbv29$freq, digits = 2)) %>% as.data.frame
colnames(nkt_table) <- c("num_id","Sample","Exp","All TRB NKT", "TRAV11-TRAJ18", "TRBV1","TRBV13-2","TRBV29")
nkt_table <- nkt_table %>% mutate(Sample = paste(Sample, Exp)) %>% select(-num_id, -Exp) %>% arrange(Sample)
kable(nkt_table, format = "html") %>%
kable_styling(full_width = F, font_size = 11,
bootstrap_options = c("striped", "hover", "condensed", "responsive"))| Sample | All TRB NKT | TRAV11-TRAJ18 | TRBV1 | TRBV13-2 | TRBV29 |
|---|---|---|---|---|---|
| CD4 Lymph nodes CA 1 | 0.18 | 0.02574 | 0.032 | 0.120 | 0.027 |
| CD4 Lymph nodes CA 2 | 0.19 | 0.02326 | 0.031 | 0.124 | 0.031 |
| CD4 Lymph nodes CA 3 | 0.16 | 0.02046 | 0.024 | 0.109 | 0.028 |
| CD4 Lymph nodes CAKR 2 | 0.19 | 0.01704 | 0.031 | 0.127 | 0.029 |
| CD4 Lymph nodes CAKR 3 | 0.18 | 0.02357 | 0.034 | 0.120 | 0.030 |
| CD4 Lymph nodes WT 2 | 0.17 | 0.00474 | 0.028 | 0.114 | 0.029 |
| CD4 Lymph nodes WT 3 | 0.17 | 0.00307 | 0.031 | 0.111 | 0.029 |
| CD4 Thymus CA 1 | 0.36 | 0.25798 | 0.032 | 0.228 | 0.101 |
| CD4 Thymus CA 3 | 0.38 | 0.26069 | 0.033 | 0.219 | 0.124 |
| CD4 Thymus CAKR 1 | 0.30 | 0.22058 | 0.033 | 0.205 | 0.065 |
| CD4 Thymus CAKR 2 | 0.32 | 0.38268 | 0.069 | 0.185 | 0.068 |
| CD4 Thymus CAKR 3 | 0.38 | 0.23262 | 0.042 | 0.249 | 0.084 |
| CD4 Thymus WT 2 | 0.23 | 0.08824 | 0.034 | 0.136 | 0.058 |
| CD4 Thymus WT 3 | 0.19 | 0.06685 | 0.038 | 0.107 | 0.045 |
| CD8 Lymph nodes CA 2 | 0.21 | 0.00076 | 0.028 | 0.084 | 0.099 |
| CD8 Lymph nodes CA 3 | 0.20 | 0.00137 | 0.026 | 0.085 | 0.091 |
| CD8 Lymph nodes CAKR 2 | 0.24 | 0.00084 | 0.030 | 0.122 | 0.087 |
| CD8 Lymph nodes CAKR 3 | 0.20 | 0.00080 | 0.027 | 0.089 | 0.082 |
| CD8 Lymph nodes WT 1 | 0.23 | 0.00117 | 0.028 | 0.108 | 0.096 |
| CD8 Lymph nodes WT 2 | 0.21 | 0.00051 | 0.024 | 0.091 | 0.098 |
| CD8 Lymph nodes WT 3 | 0.21 | 0.00091 | 0.024 | 0.089 | 0.095 |
| CD8 Thymus CA 1 | 0.22 | 0.00219 | 0.029 | 0.116 | 0.078 |
| CD8 Thymus CA 3 | 0.19 | 0.00102 | 0.024 | 0.084 | 0.078 |
| CD8 Thymus CAKR 2 | 0.23 | 0.00521 | 0.028 | 0.124 | 0.076 |
| CD8 Thymus CAKR 3 | 0.19 | 0.00188 | 0.032 | 0.105 | 0.056 |
| CD8 Thymus WT 1 | 0.23 | 0.00126 | 0.022 | 0.111 | 0.101 |
| CD8 Thymus WT 2 | 0.26 | 0.00115 | 0.027 | 0.129 | 0.102 |
| CD8 Thymus WT 3 | 0.20 | 0.00126 | 0.023 | 0.086 | 0.090 |
## CD4
p01 <- nkt_table_trb %>%
filter(Cell_type == "CD4") %>%
ggplot(aes(y = freq*100, x = factor(sample_type, levels = levels_cd4))) +
stat_summary(fun = "mean",
geom = "crossbar",
width = 0.5,
colour = "black") +
geom_point(aes(color = Mouse_strain)) +
ylab("Percentage of all chains") +
ggtitle("NKT segments in CD4 TRB") +
theme_classic() +
ggtheme() +
scale_color_manual(values = c("dodgerblue","indianred2","gray40")) +
ylim(c(0,NA)) + xlab("")
#ggsave("./final_fig/nkt/pct_all_nkt_trb_cd4.png", width = 8, height = 8, units = "cm")
#ggsave("./final_fig/nkt/pct_all_nkt_trb_cd4.eps", width = 8, height = 8, units = "cm")
## CD8
p02 <- nkt_table_trb %>%
filter(Cell_type == "CD8") %>%
ggplot(aes(y = freq*100, x = factor(sample_type, levels = levels_cd8))) +
stat_summary(fun = "mean",
geom = "crossbar",
width = 0.5,
colour = "black") +
geom_point(aes(color = Mouse_strain)) +
ylab("Percentage of all chains") +
ggtitle("NKT segments in CD8 TRB") +
theme_classic() +
ggtheme() +
scale_color_manual(values = c("dodgerblue","indianred2","gray40")) +
ylim(c(0,NA)) + xlab("")
#ggsave("./final_fig/nkt/pct_all_nkt_trb_cd8.png", width = 8, height = 8, units = "cm")
#ggsave("./final_fig/nkt/pct_all_nkt_trb_cd8.eps", width = 8, height = 8, units = "cm")
p01 + p02## CD4
p03 <- nkt_table_trav11_traj18 %>%
filter(Cell_type == "CD4") %>%
ggplot(aes(y = freq*100, x = factor(sample_type, levels = levels_cd4))) +
stat_summary(fun = "mean",
geom = "crossbar",
width = 0.5,
colour = "black") +
geom_point(aes(color = Mouse_strain)) +
ylab("Percentage of all chains") +
ggtitle("TRAV11-TRAJ18 segments in CD4 cells") +
theme_classic() +
ggtheme() +
scale_color_manual(values = c("dodgerblue","indianred2","gray40")) +
ylim(c(0,NA)) + xlab("")
#ggsave("./final_fig/nkt/pct_nkt_trav11_traj18_tra_cd4.png", width = 8, height = 8, units = "cm")
#ggsave("./final_fig/nkt/pct_nkt_trav11_traj18_tra_cd4.eps", width = 8, height = 8, units = "cm")
## CD8
p04 <- nkt_table_trav11_traj18 %>%
filter(Cell_type == "CD8") %>%
ggplot(aes(y = freq*100, x = factor(sample_type, levels = levels_cd8))) +
stat_summary(fun = "mean",
geom = "crossbar",
width = 0.5,
colour = "black") +
geom_point(aes(color = Mouse_strain)) +
ylab("Percentage of all chains") +
ggtitle("TRAV11-TRAJ18 segments in CD8 cells") +
theme_classic() +
ggtheme() +
scale_color_manual(values = c("dodgerblue","indianred2","gray40")) +
ylim(c(0,1)) + xlab("")
#ggsave("./final_fig/nkt/pct_nkt_trav11_traj18_tra_cd8.png", width = 8, height = 8, units = "cm")
#ggsave("./final_fig/nkt/pct_nkt_trav11_traj18_tra_cd8.eps", width = 8, height = 8, units = "cm")
p03 + p04## CD4
p05 <- nkt_table_trbv1 %>%
filter(Cell_type == "CD4") %>%
ggplot(aes(y = freq*100, x = factor(sample_type, levels = levels_cd4))) +
stat_summary(fun = "mean",
geom = "crossbar",
width = 0.5,
colour = "black") +
geom_point(aes(color = Mouse_strain)) +
ylab("Percentage of all chains") +
ggtitle("CD4 TRBV1") +
theme_classic() +
ggtheme() +
scale_color_manual(values = c("dodgerblue","indianred2","gray40")) +
ylim(c(0,NA)) + xlab("")
#ggsave("./final_fig/nkt/pct_nkt_trbv1_cd4.png", width = 8, height = 8, units = "cm")
#ggsave("./final_fig/nkt/pct_nkt_trbv1_cd4.eps", width = 8, height = 8, units = "cm")
## CD8
p05 <- nkt_table_trbv1 %>%
filter(Cell_type == "CD8") %>%
ggplot(aes(y = freq*100, x = factor(sample_type, levels = levels_cd8))) +
stat_summary(fun = "mean",
geom = "crossbar",
width = 0.5,
colour = "black") +
geom_point(aes(color = Mouse_strain)) +
ylab("Percentage of all chains") +
ggtitle("CD8 TRBV1") +
theme_classic() +
ggtheme() +
scale_color_manual(values = c("dodgerblue","indianred2","gray40")) +
ylim(c(0,NA)) + xlab("")
#ggsave("./final_fig/nkt/pct_nkt_trbv1_cd8.png", width = 8, height = 8, units = "cm")
#ggsave("./final_fig/nkt/pct_nkt_trbv1_cd8.eps", width = 8, height = 8, units = "cm")
p05 + p06## Error in eval(expr, envir, enclos): object 'p06' not found
## CD4
p07 <- nkt_table_trbv13_2 %>%
filter(Cell_type == "CD4") %>%
ggplot(aes(y = freq*100, x = factor(sample_type, levels = levels_cd4))) +
stat_summary(fun = "mean",
geom = "crossbar",
width = 0.5,
colour = "black") +
geom_point(aes(color = Mouse_strain)) +
ylab("Percentage of all chains") +
ggtitle("TRBV13-2 segments in CD4 cells") +
theme_classic() +
ggtheme() +
scale_color_manual(values = c("dodgerblue","indianred2","gray40")) +
ylim(c(0,NA)) + xlab("")
#ggsave("./final_fig/nkt/pct_nkt_trbv13_2_cd4.png", width = 8, height = 8, units = "cm")
#ggsave("./final_fig/nkt/pct_nkt_trbv13_2_cd4.eps", width = 8, height = 8, units = "cm")
## CD8
p08 <- nkt_table_trbv13_2 %>%
filter(Cell_type == "CD8") %>%
ggplot(aes(y = freq*100, x = factor(sample_type, levels = levels_cd8))) +
stat_summary(fun = "mean",
geom = "crossbar",
width = 0.5,
colour = "black") +
geom_point(aes(color = Mouse_strain)) +
ylab("Percentage of all chains") +
ggtitle("TRBV13-2 segments in CD8 cells") +
theme_classic() +
ggtheme() +
scale_color_manual(values = c("dodgerblue","indianred2","gray40")) +
ylim(c(0,NA)) + xlab("")
#ggsave("./final_fig/nkt/pct_nkt_trbv13_2_cd8.png", width = 8, height = 8, units = "cm")
#ggsave("./final_fig/nkt/pct_nkt_trbv13_2_cd8.eps", width = 8, height = 8, units = "cm")
p07 + p08## CD4
p09 <- nkt_table_trbv29 %>%
filter(Cell_type == "CD4") %>%
ggplot(aes(y = freq*100, x = factor(sample_type, levels = levels_cd4))) +
stat_summary(fun = "mean",
geom = "crossbar",
width = 0.5,
colour = "black") +
geom_point(aes(color = Mouse_strain)) +
ylab("Percentage of all chains") +
ggtitle("CD4 TRBV29") +
theme_classic() +
ggtheme() +
scale_color_manual(values = c("dodgerblue","indianred2","gray40")) +
ylim(c(0,NA)) + xlab("")
#ggsave("./final_fig/nkt/pct_nkt_trbv29_cd4.png", width = 8, height = 8, units = "cm")
#ggsave("./final_fig/nkt/pct_nkt_trbv29_cd4.eps", width = 8, height = 8, units = "cm")
## CD8
p10 <- nkt_table_trbv29 %>%
filter(Cell_type == "CD8") %>%
ggplot(aes(y = freq*100, x = factor(sample_type, levels = levels_cd8))) +
stat_summary(fun = "mean",
geom = "crossbar",
width = 0.5,
colour = "black") +
geom_point(aes(color = Mouse_strain)) +
ylab("Percentage of all chains") +
ggtitle("CD8 TRBV29") +
theme_classic() +
ggtheme() +
scale_color_manual(values = c("dodgerblue","indianred2","gray40")) +
ylim(c(0,NA)) + xlab("")
#ggsave("./final_fig/nkt/pct_nkt_trbv29_cd8.png", width = 8, height = 8, units = "cm")
#ggsave("./final_fig/nkt/pct_nkt_trbv29_cd8.eps", width = 8, height = 8, units = "cm")
p09 + p10prop.table.tra2_filter.cd4 <- prop.table.tra_filter[,1:14]
prop.table.tra2_filter.cd4$sum <- rowSums(prop.table.tra2_filter.cd4)
prop.table.tra2_filter.cd4 <- prop.table.tra2_filter.cd4 %>% filter(sum >0) %>% select(-sum)
prop.table.trb2_filter.cd4 <- prop.table.trb_filter[,1:14]
prop.table.trb2_filter.cd4$sum <- rowSums(prop.table.trb2_filter.cd4)
prop.table.trb2_filter.cd4 <- prop.table.trb2_filter.cd4 %>% filter(sum >0) %>% select(-sum)
prop.table.tra2_filter.cd8 <- prop.table.tra_filter[,15:28]
prop.table.tra2_filter.cd8$sum <- rowSums(prop.table.tra2_filter.cd8)
prop.table.tra2_filter.cd8 <- prop.table.tra2_filter.cd8 %>% filter(sum >0) %>% select(-sum)
prop.table.trb2_filter.cd8 <- prop.table.trb_filter[,15:28]
prop.table.trb2_filter.cd8$sum <- rowSums(prop.table.trb2_filter.cd8)
prop.table.trb2_filter.cd8 <- prop.table.trb2_filter.cd8 %>% filter(sum >0) %>% select(-sum)prop.table.tra2_filter.merge <- rbind(prop.table.tra2_filter.cd4, prop.table.trb2_filter.cd4)
res.pca.merge.cd4 <- prcomp(t(prop.table.tra2_filter.merge), scale = TRUE, center = T)
mdres.pca.merge.cd4 <- colnames(prop.table.tra2_filter.merge[,1:14]) %>%
as.data.frame() %>%
mutate(sample = stringr::str_replace_all(., pattern = "Lymph nodes", replacement = "LN")) %>%
separate(sample, into = c("Cell_type","Organ","Strain","Exp"))
fviz_pca_ind(res.pca.merge.cd4,
col.ind = as.factor(mdres.pca.merge.cd4$Strain), # color by groups
legend.title = "Groups",
geom.ind = "point",
invisible="quali", pointsize = 0) +
scale_color_manual(values=c("dodgerblue","indianred2","gray40")) +
geom_point(aes(shape = as.factor(mdres.pca.merge.cd4$Organ),
color = as.factor(mdres.pca.merge.cd4$Strain))) +
scale_shape_manual(values = c(8,8,8,15,15)) +
ggtheme() +
theme(axis.text.x = element_text(angle = 0))#ggsave("final_fig/pca/cd4.png", width = 2.6, height = 1.7)
#ggsave("final_fig/pca/cd4.svg", width = 2.6, height = 1.7)Zoom
fviz_pca_ind(res.pca.merge.cd4,
col.ind = as.factor(mdres.pca.merge.cd4$Strain), # color by groups
legend.title = "Groups",
geom.ind = "point",
invisible="quali", pointsize = 0) +
scale_color_manual(values=c("dodgerblue","indianred2","gray40")) +
geom_point(aes(shape = as.factor(mdres.pca.merge.cd4$Organ),
color = as.factor(mdres.pca.merge.cd4$Strain))) +
scale_shape_manual(values = c(8,8,8,15,15)) +
xlim(c(-30,0)) + ylim(-2,15) +
ggtheme() +
theme(axis.text.x = element_text(angle = 0))#ggsave("final_fig/pca/cd4_zoom.png", width = 2.2, height = 1.4)
#ggsave("final_fig/pca/cd4_zoom.svg", width = 2.2, height = 1.4)prop.table.tra2_filter.merge <- rbind(prop.table.tra2_filter.cd8, prop.table.trb2_filter.cd8)
res.pca.merge.cd8 <- prcomp(t(prop.table.tra2_filter.merge), scale = TRUE, center = T)
mdres.pca.merge.cd8 <- colnames(prop.table.tra2_filter.merge[,1:14]) %>%
as.data.frame() %>%
mutate(sample = stringr::str_replace_all(., pattern = "Lymph nodes", replacement = "LN")) %>%
separate(sample, into = c("Cell_type","Organ","Strain","Exp"))
fviz_pca_ind(res.pca.merge.cd8,
col.ind = as.factor(mdres.pca.merge.cd8$Strain), # color by groups
legend.title = "Groups",
geom.ind = "point",
invisible="quali", pointsize = 0) +
scale_color_manual(values=c("dodgerblue","indianred2","gray40")) +
geom_point(aes(shape = as.factor(mdres.pca.merge.cd8$Organ),
color = as.factor(mdres.pca.merge.cd8$Strain))) +
scale_shape_manual(values = c(8,8,8,15,15)) +
ggtheme()#ggsave("final_fig/pca/cd8.png", width = 2.6, height = 1.7)
#ggsave("final_fig/pca/cd8.svg", width = 2.6, height = 1.7)Zoom
fviz_pca_ind(res.pca.merge.cd8,
col.ind = as.factor(mdres.pca.merge.cd8$Strain), # color by groups
legend.title = "Groups",
geom.ind = "point",
invisible="quali", pointsize = 0) +
scale_color_manual(values=c("dodgerblue","indianred2","gray40")) +
geom_point(aes(shape = as.factor(mdres.pca.merge.cd8$Organ),
color = as.factor(mdres.pca.merge.cd8$Strain))) +
scale_shape_manual(values = c(8,8,8,15,15)) +
xlim(c(-32,-18)) + ylim(-4,5) +
ggtheme()#ggsave("final_fig/pca/cd8_zoom.png", width = 2.2, height = 1.4)
#ggsave("final_fig/pca/cd8_zoom.svg", width = 2.2, height = 1.4)res.pca <- prcomp(t(prop.table.tra_filter[,1:28]), scale = TRUE)
md <- colnames(prop.table.tra_filter[,1:28]) %>%
as.data.frame() %>%
mutate(sample = stringr::str_replace_all(., pattern = "Lymph nodes", replacement = "LN")) %>%
separate(sample, into = c("Cell_type","Organ","Strain","Exp"))## All
prop.table.tra2_filter.cd4 <- prop.table.tra_filter[,1:14]
prop.table.tra2_filter.cd4$sum <- rowSums(prop.table.tra2_filter.cd4)
prop.table.tra2_filter.cd4 <- prop.table.tra2_filter.cd4 %>% filter(sum >0) %>% select(-sum)
res.pca.tra.cd4 <- prcomp(t(prop.table.tra2_filter.cd4), scale = TRUE, center = T)
mdres.pca.tra.cd4 <- colnames(prop.table.tra_filter[,1:14]) %>%
as.data.frame() %>%
mutate(sample = stringr::str_replace_all(., pattern = "Lymph nodes", replacement = "LN")) %>%
separate(sample, into = c("Cell_type","Organ","Strain","Exp"))
# LN
prop.table.tra2_filter %>% colnames## Error in is.data.frame(x): object 'prop.table.tra2_filter' not found
prop.table.tra2_filter.cd4.ln <- prop.table.tra_filter[,1:7]
prop.table.tra2_filter.cd4.ln$sum <- rowSums(prop.table.tra2_filter.cd4.ln)
prop.table.tra2_filter.cd4.ln <- prop.table.tra2_filter.cd4.ln %>% filter(sum >0) %>% select(-sum)
res.pca.tra.cd4.ln <- prcomp(t(prop.table.tra2_filter.cd4.ln), scale = TRUE, center = T)
mdres.pca.tra.cd4.ln <- colnames(prop.table.tra_filter[,1:7]) %>%
as.data.frame() %>%
mutate(sample = stringr::str_replace_all(., pattern = "Lymph nodes", replacement = "LN")) %>%
separate(sample, into = c("Cell_type","Organ","Strain","Exp"))
fviz_pca_ind(res.pca.tra.cd4.ln,
col.ind = as.factor(mdres.pca.tra.cd4.ln$Strain), # color by groups
legend.title = "Groups",
geom.ind = "point",
invisible="quali", pointsize = 0) +
scale_color_manual(values=c("dodgerblue","indianred2","gray40")) +
geom_point(aes(shape = as.factor(mdres.pca.tra.cd4.ln$Organ),
color = as.factor(mdres.pca.tra.cd4.ln$Strain))) +
geom_hline(yintercept = 0, color = "grey20") +
geom_vline(xintercept = 0, color = "grey20") #ggsave("final_fig/pca/cd4_tra_ln.png", width = 2.6, height = 1.7)
#ggsave("final_fig/pca/cd4_tra_ln.svg", width = 2.6, height = 1.7)
# Thymus
prop.table.tra2_filter.cd4.thy <- prop.table.tra_filter[,8:14]
prop.table.tra2_filter.cd4.thy$sum <- rowSums(prop.table.tra2_filter.cd4.thy)
prop.table.tra2_filter.cd4.thy <- prop.table.tra2_filter.cd4.thy %>% filter(sum >0) %>% select(-sum)
res.pca.tra.cd4.thy <- prcomp(t(prop.table.tra2_filter.cd4.thy), scale = TRUE, center = T)
mdres.pca.tra.cd4.thy <- colnames(prop.table.tra_filter[,8:14]) %>%
as.data.frame() %>%
mutate(sample = stringr::str_replace_all(., pattern = "Lymph nodes", replacement = "LN")) %>%
separate(sample, into = c("Cell_type","Organ","Strain","Exp"))
fviz_pca_ind(res.pca.tra.cd4.thy,
col.ind = as.factor(mdres.pca.tra.cd4.thy$Strain), # color by groups
legend.title = "Groups",
geom.ind = "point",
invisible="quali", pointsize = 0) +
scale_color_manual(values=c("dodgerblue","indianred2","gray40")) +
geom_point(aes(shape = as.factor(mdres.pca.tra.cd4.thy$Organ),
color = as.factor(mdres.pca.tra.cd4.thy$Strain))) +
geom_hline(yintercept = 0, color = "grey20") +
geom_vline(xintercept = 0, color = "grey20") +
ggtheme()#ggsave("final_fig/pca/cd4_tra_thy.png", width = 2.6, height = 1.7)
#ggsave("final_fig/pca/cd4_tra_thy.svg", width = 2.6, height = 1.7)## All
prop.table.tra2_filter.cd8 <- prop.table.tra_filter[,15:28]
prop.table.tra2_filter.cd8$sum <- rowSums(prop.table.tra2_filter.cd8)
prop.table.tra2_filter.cd8 <- prop.table.tra2_filter.cd8 %>% filter(sum >0) %>% select(-sum)
res.pca.tra.cd8 <- prcomp(t(prop.table.tra2_filter.cd8), scale = TRUE, center = T)
mdres.pca.tra.cd8 <- colnames(prop.table.tra2_filter[,15:28]) %>%
as.data.frame() %>%
mutate(sample = stringr::str_replace_all(., pattern = "Lymph nodes", replacement = "LN")) %>%
separate(sample, into = c("Cell_type","Organ","Strain","Exp"))## Error in is.data.frame(x): object 'prop.table.tra2_filter' not found
# LN
prop.table.tra2_filter %>% colnames## Error in is.data.frame(x): object 'prop.table.tra2_filter' not found
prop.table.tra2_filter.cd8.ln <- prop.table.tra_filter[,15:21]
prop.table.tra2_filter.cd8.ln$sum <- rowSums(prop.table.tra2_filter.cd8.ln)
prop.table.tra2_filter.cd8.ln <- prop.table.tra2_filter.cd8.ln %>% filter(sum >0) %>% select(-sum)
res.pca.tra.cd8.ln <- prcomp(t(prop.table.tra2_filter.cd8.ln), scale = TRUE, center = T)
mdres.pca.tra.cd8.ln <- colnames(prop.table.tra_filter[,15:21]) %>%
as.data.frame() %>%
mutate(sample = stringr::str_replace_all(., pattern = "Lymph nodes", replacement = "LN")) %>%
separate(sample, into = c("Cell_type","Organ","Strain","Exp"))
fviz_pca_ind(res.pca.tra.cd8.ln,
col.ind = as.factor(mdres.pca.tra.cd8.ln$Strain), # color by groups
legend.title = "Groups",
geom.ind = "point",
invisible="quali", pointsize = 0) +
scale_color_manual(values=c("dodgerblue","indianred2","gray40")) +
geom_point(aes(shape = as.factor(mdres.pca.tra.cd8.ln$Organ),
color = as.factor(mdres.pca.tra.cd8.ln$Strain))) +
geom_hline(yintercept = 0, color = "grey20") +
geom_vline(xintercept = 0, color = "grey20") #ggsave("final_fig/pca/cd8_tra_ln.png", width = 2.6, height = 1.7)
#ggsave("final_fig/pca/cd8_tra_ln.svg", width = 2.6, height = 1.7)
# Thymus
prop.table.tra2_filter %>% colnames## Error in is.data.frame(x): object 'prop.table.tra2_filter' not found
prop.table.tra2_filter.cd8.thy <- prop.table.tra_filter[,22:28]
prop.table.tra2_filter.cd8.thy$sum <- rowSums(prop.table.tra2_filter.cd8.thy)
prop.table.tra2_filter.cd8.thy <- prop.table.tra2_filter.cd8.thy %>% filter(sum >0) %>% select(-sum)
res.pca.tra.cd8.thy <- prcomp(t(prop.table.tra2_filter.cd8.thy), scale = TRUE, center = T)
mdres.pca.tra.cd8.thy <- colnames(prop.table.tra_filter[,22:28]) %>%
as.data.frame() %>%
mutate(sample = stringr::str_replace_all(., pattern = "Lymph nodes", replacement = "LN")) %>%
separate(sample, into = c("Cell_type","Organ","Strain","Exp"))
fviz_pca_ind(res.pca.tra.cd8.thy,
col.ind = as.factor(mdres.pca.tra.cd8.thy$Strain), # color by groups
legend.title = "Groups",
geom.ind = "point",
invisible="quali", pointsize = 0) +
scale_color_manual(values=c("dodgerblue","indianred2","gray40")) +
geom_point(aes(shape = as.factor(mdres.pca.tra.cd8.thy$Organ),
color = as.factor(mdres.pca.tra.cd8.thy$Strain))) +
geom_hline(yintercept = 0, color = "grey20") +
geom_vline(xintercept = 0, color = "grey20") #ggsave("final_fig/pca/cd8_tra_thy.png", width = 2.6, height = 1.7)
#ggsave("final_fig/pca/cd8_tra_thy.svg", width = 2.6, height = 1.7)res.pca.trb <- prcomp(t(prop.table.trb_filter[,5:32]), scale = TRUE)## Error in colMeans(x, na.rm = TRUE): 'x' must be numeric
md_trb <- colnames(prop.table.trb_filter[,5:32]) %>%
as.data.frame() %>%
mutate(sample = stringr::str_replace_all(., pattern = "Lymph nodes", replacement = "LN")) %>%
separate(sample, into = c("Cell_type","Organ","Strain","Exp"))## All
colnames(prop.table.trb2_filter)## Error in is.data.frame(x): object 'prop.table.trb2_filter' not found
prop.table.trb2_filter.cd4 <- prop.table.trb_filter[,5:18]
prop.table.trb2_filter.cd4$sum <- rowSums(prop.table.trb2_filter.cd4)
prop.table.trb2_filter.cd4 <- prop.table.trb2_filter.cd4 %>% filter(sum >0) %>% select(-sum)
res.pca.trb.cd4 <- prcomp(t(prop.table.trb2_filter.cd4), scale = TRUE, center = T)
mdres.pca.trb.cd4 <- colnames(prop.table.trb_filter[,5:18]) %>%
as.data.frame() %>%
mutate(sample = stringr::str_replace_all(., pattern = "Lymph nodes", replacement = "LN")) %>%
separate(sample, into = c("Cell_type","Organ","Strain","Exp"))
# LN
prop.table.trb_filter %>% colnames## [1] "CD4 Lymph nodes CA Exp01" "CD4 Lymph nodes CA Exp02"
## [3] "CD4 Lymph nodes CA Exp03" "CD4 Lymph nodes CAKR Exp02"
## [5] "CD4 Lymph nodes CAKR Exp03" "CD4 Lymph nodes WT Exp02"
## [7] "CD4 Lymph nodes WT Exp03" "CD4 Thymus CA Exp01"
## [9] "CD4 Thymus CA Exp03" "CD4 Thymus CAKR Exp01"
## [11] "CD4 Thymus CAKR Exp02" "CD4 Thymus CAKR Exp03"
## [13] "CD4 Thymus WT Exp02" "CD4 Thymus WT Exp03"
## [15] "CD8 Lymph nodes CA Exp02" "CD8 Lymph nodes CA Exp03"
## [17] "CD8 Lymph nodes CAKR Exp02" "CD8 Lymph nodes CAKR Exp03"
## [19] "CD8 Lymph nodes WT Exp01" "CD8 Lymph nodes WT Exp02"
## [21] "CD8 Lymph nodes WT Exp03" "CD8 Thymus CA Exp01"
## [23] "CD8 Thymus CA Exp03" "CD8 Thymus CAKR Exp02"
## [25] "CD8 Thymus CAKR Exp03" "CD8 Thymus WT Exp01"
## [27] "CD8 Thymus WT Exp02" "CD8 Thymus WT Exp03"
## [29] "aaSeqCDR3" "allVHitsWithScore"
## [31] "allDHitsWithScore" "allJHitsWithScore"
prop.table.trb2_filter.cd4.ln <- prop.table.trb_filter[,5:11]
prop.table.trb2_filter.cd4.ln$sum <- rowSums(prop.table.trb2_filter.cd4.ln)
prop.table.trb2_filter.cd4.ln <- prop.table.trb2_filter.cd4.ln %>% filter(sum >0) %>% select(-sum)
res.pca.trb.cd4.ln <- prcomp(t(prop.table.trb2_filter.cd4.ln), scale = TRUE, center = T)
mdres.pca.trb.cd4.ln <- colnames(prop.table.trb_filter[,5:11]) %>%
as.data.frame() %>%
mutate(sample = stringr::str_replace_all(., pattern = "Lymph nodes", replacement = "LN")) %>%
separate(sample, into = c("Cell_type","Organ","Strain","Exp"))
fviz_pca_ind(res.pca.trb.cd4.ln,
col.ind = as.factor(mdres.pca.trb.cd4.ln$Strain), # color by groups
legend.title = "Groups",
geom.ind = "point",
invisible="quali", pointsize = 0) +
scale_color_manual(values=c("dodgerblue","indianred2","gray40")) +
geom_point(aes(shape = as.factor(mdres.pca.trb.cd4.ln$Organ),
color = as.factor(mdres.pca.trb.cd4.ln$Strain))) +
geom_hline(yintercept = 0, color = "grey20") +
geom_vline(xintercept = 0, color = "grey20") #ggsave("final_fig/pca/cd4_trb_ln.png", width = 2.6, height = 1.7)
#ggsave("final_fig/pca/cd4_trb_ln.svg", width = 2.6, height = 1.7)
# Thymus
prop.table.trb_filter %>% colnames## [1] "CD4 Lymph nodes CA Exp01" "CD4 Lymph nodes CA Exp02"
## [3] "CD4 Lymph nodes CA Exp03" "CD4 Lymph nodes CAKR Exp02"
## [5] "CD4 Lymph nodes CAKR Exp03" "CD4 Lymph nodes WT Exp02"
## [7] "CD4 Lymph nodes WT Exp03" "CD4 Thymus CA Exp01"
## [9] "CD4 Thymus CA Exp03" "CD4 Thymus CAKR Exp01"
## [11] "CD4 Thymus CAKR Exp02" "CD4 Thymus CAKR Exp03"
## [13] "CD4 Thymus WT Exp02" "CD4 Thymus WT Exp03"
## [15] "CD8 Lymph nodes CA Exp02" "CD8 Lymph nodes CA Exp03"
## [17] "CD8 Lymph nodes CAKR Exp02" "CD8 Lymph nodes CAKR Exp03"
## [19] "CD8 Lymph nodes WT Exp01" "CD8 Lymph nodes WT Exp02"
## [21] "CD8 Lymph nodes WT Exp03" "CD8 Thymus CA Exp01"
## [23] "CD8 Thymus CA Exp03" "CD8 Thymus CAKR Exp02"
## [25] "CD8 Thymus CAKR Exp03" "CD8 Thymus WT Exp01"
## [27] "CD8 Thymus WT Exp02" "CD8 Thymus WT Exp03"
## [29] "aaSeqCDR3" "allVHitsWithScore"
## [31] "allDHitsWithScore" "allJHitsWithScore"
prop.table.trb2_filter.cd4.thy <- prop.table.trb_filter[,12:18]
prop.table.trb2_filter.cd4.thy$sum <- rowSums(prop.table.trb2_filter.cd4.thy)
prop.table.trb2_filter.cd4.thy <- prop.table.trb2_filter.cd4.thy %>% filter(sum >0) %>% select(-sum)
res.pca.trb.cd4.thy <- prcomp(t(prop.table.trb2_filter.cd4.thy), scale = TRUE, center = T)
mdres.pca.trb.cd4.thy <- colnames(prop.table.trb_filter[,12:18]) %>%
as.data.frame() %>%
mutate(sample = stringr::str_replace_all(., pattern = "Lymph nodes", replacement = "LN")) %>%
separate(sample, into = c("Cell_type","Organ","Strain","Exp"))
fviz_pca_ind(res.pca.trb.cd4.thy,
col.ind = as.factor(mdres.pca.trb.cd4.thy$Strain), # color by groups
legend.title = "Groups",
geom.ind = "point",
invisible="quali", pointsize = 0) +
scale_color_manual(values=c("dodgerblue","indianred2","gray40")) +
geom_point(aes(shape = as.factor(mdres.pca.trb.cd4.thy$Organ),
color = as.factor(mdres.pca.trb.cd4.thy$Strain))) +
geom_hline(yintercept = 0, color = "grey20") +
geom_vline(xintercept = 0, color = "grey20") #ggsave("final_fig/pca/cd4_trb_thy.png", width = 2.6, height = 1.7)
#ggsave("final_fig/pca/cd4_trb_thy.svg", width = 2.6, height = 1.7)## All
prop.table.trb2_filter.cd8 <- prop.table.trb_filter[,19:32]
prop.table.trb2_filter.cd8$sum <- rowSums(prop.table.trb2_filter.cd8)## Error in rowSums(prop.table.trb2_filter.cd8): 'x' must be numeric
prop.table.trb2_filter.cd8 <- prop.table.trb2_filter.cd8 %>% filter(sum >0) %>% select(-sum)## Error in `filter()`:
## ! Problem while computing `..1 = sum > 0`.
## Caused by error in `sum > 0`:
## ! comparison (6) is possible only for atomic and list types
res.pca.trb.cd8 <- prcomp(t(prop.table.trb2_filter.cd8), scale = TRUE, center = T)## Error in colMeans(x, na.rm = TRUE): 'x' must be numeric
mdres.pca.trb.cd8 <- colnames(prop.table.trb_filter[,19:32]) %>%
as.data.frame() %>%
mutate(sample = stringr::str_replace_all(., pattern = "Lymph nodes", replacement = "LN")) %>%
separate(sample, into = c("Cell_type","Organ","Strain","Exp"))
# LN
prop.table.trb_filter %>% colnames## [1] "CD4 Lymph nodes CA Exp01" "CD4 Lymph nodes CA Exp02"
## [3] "CD4 Lymph nodes CA Exp03" "CD4 Lymph nodes CAKR Exp02"
## [5] "CD4 Lymph nodes CAKR Exp03" "CD4 Lymph nodes WT Exp02"
## [7] "CD4 Lymph nodes WT Exp03" "CD4 Thymus CA Exp01"
## [9] "CD4 Thymus CA Exp03" "CD4 Thymus CAKR Exp01"
## [11] "CD4 Thymus CAKR Exp02" "CD4 Thymus CAKR Exp03"
## [13] "CD4 Thymus WT Exp02" "CD4 Thymus WT Exp03"
## [15] "CD8 Lymph nodes CA Exp02" "CD8 Lymph nodes CA Exp03"
## [17] "CD8 Lymph nodes CAKR Exp02" "CD8 Lymph nodes CAKR Exp03"
## [19] "CD8 Lymph nodes WT Exp01" "CD8 Lymph nodes WT Exp02"
## [21] "CD8 Lymph nodes WT Exp03" "CD8 Thymus CA Exp01"
## [23] "CD8 Thymus CA Exp03" "CD8 Thymus CAKR Exp02"
## [25] "CD8 Thymus CAKR Exp03" "CD8 Thymus WT Exp01"
## [27] "CD8 Thymus WT Exp02" "CD8 Thymus WT Exp03"
## [29] "aaSeqCDR3" "allVHitsWithScore"
## [31] "allDHitsWithScore" "allJHitsWithScore"
prop.table.trb2_filter.cd8.ln <- prop.table.trb_filter[,19:25]
prop.table.trb2_filter.cd8.ln$sum <- rowSums(prop.table.trb2_filter.cd8.ln)
prop.table.trb2_filter.cd8.ln <- prop.table.trb2_filter.cd8.ln %>% filter(sum >0) %>% select(-sum)
res.pca.trb.cd8.ln <- prcomp(t(prop.table.trb2_filter.cd8.ln), scale = TRUE, center = T)
mdres.pca.trb.cd8.ln <- colnames(prop.table.trb_filter[,19:25]) %>%
as.data.frame() %>%
mutate(sample = stringr::str_replace_all(., pattern = "Lymph nodes", replacement = "LN")) %>%
separate(sample, into = c("Cell_type","Organ","Strain","Exp"))
fviz_pca_ind(res.pca.trb.cd8.ln,
col.ind = as.factor(mdres.pca.trb.cd8.ln$Strain), # color by groups
legend.title = "Groups",
geom.ind = "point",
invisible="quali", pointsize = 0) +
scale_color_manual(values=c("dodgerblue","indianred2","gray40")) +
geom_point(aes(shape = as.factor(mdres.pca.trb.cd8.ln$Organ),
color = as.factor(mdres.pca.trb.cd8.ln$Strain))) +
geom_hline(yintercept = 0, color = "grey20") +
geom_vline(xintercept = 0, color = "grey20") #ggsave("final_fig/pca/cd8_trb_ln.png", width = 2.6, height = 1.7)
#ggsave("final_fig/pca/cd8_trb_ln.svg", width = 2.6, height = 1.7)
# Thymus
prop.table.trb_filter %>% colnames## [1] "CD4 Lymph nodes CA Exp01" "CD4 Lymph nodes CA Exp02"
## [3] "CD4 Lymph nodes CA Exp03" "CD4 Lymph nodes CAKR Exp02"
## [5] "CD4 Lymph nodes CAKR Exp03" "CD4 Lymph nodes WT Exp02"
## [7] "CD4 Lymph nodes WT Exp03" "CD4 Thymus CA Exp01"
## [9] "CD4 Thymus CA Exp03" "CD4 Thymus CAKR Exp01"
## [11] "CD4 Thymus CAKR Exp02" "CD4 Thymus CAKR Exp03"
## [13] "CD4 Thymus WT Exp02" "CD4 Thymus WT Exp03"
## [15] "CD8 Lymph nodes CA Exp02" "CD8 Lymph nodes CA Exp03"
## [17] "CD8 Lymph nodes CAKR Exp02" "CD8 Lymph nodes CAKR Exp03"
## [19] "CD8 Lymph nodes WT Exp01" "CD8 Lymph nodes WT Exp02"
## [21] "CD8 Lymph nodes WT Exp03" "CD8 Thymus CA Exp01"
## [23] "CD8 Thymus CA Exp03" "CD8 Thymus CAKR Exp02"
## [25] "CD8 Thymus CAKR Exp03" "CD8 Thymus WT Exp01"
## [27] "CD8 Thymus WT Exp02" "CD8 Thymus WT Exp03"
## [29] "aaSeqCDR3" "allVHitsWithScore"
## [31] "allDHitsWithScore" "allJHitsWithScore"
prop.table.trb2_filter.cd8.thy <- prop.table.trb_filter[,26:32]
prop.table.trb2_filter.cd8.thy$sum <- rowSums(prop.table.trb2_filter.cd8.thy)## Error in rowSums(prop.table.trb2_filter.cd8.thy): 'x' must be numeric
prop.table.trb2_filter.cd8.thy <- prop.table.trb2_filter.cd8.thy %>% filter(sum >0) %>% select(-sum)## Error in `filter()`:
## ! Problem while computing `..1 = sum > 0`.
## Caused by error in `sum > 0`:
## ! comparison (6) is possible only for atomic and list types
res.pca.trb.cd8.thy <- prcomp(t(prop.table.trb2_filter.cd8.thy), scale = TRUE, center = T)## Error in colMeans(x, na.rm = TRUE): 'x' must be numeric
mdres.pca.trb.cd8.thy <- colnames(prop.table.trb_filter[,26:32]) %>%
as.data.frame() %>%
mutate(sample = stringr::str_replace_all(., pattern = "Lymph nodes", replacement = "LN")) %>%
separate(sample, into = c("Cell_type","Organ","Strain","Exp"))
fviz_pca_ind(res.pca.trb.cd8.thy,
col.ind = as.factor(mdres.pca.trb.cd8.thy$Strain), # color by groups
legend.title = "Groups",
geom.ind = "point",
invisible="quali", pointsize = 0) +
scale_color_manual(values=c("dodgerblue","indianred2","gray40")) +
geom_point(aes(shape = as.factor(mdres.pca.trb.cd8.thy$Organ),
color = as.factor(mdres.pca.trb.cd8.thy$Strain))) +
geom_hline(yintercept = 0, color = "grey20") +
geom_vline(xintercept = 0, color = "grey20") ## Error in .get_facto_class(X): object 'res.pca.trb.cd8.thy' not found
#ggsave("final_fig/pca/cd8_trb_thy.png", width = 2.6, height = 1.7)
#ggsave("final_fig/pca/cd8_trb_thy.svg", width = 2.6, height = 1.7)immdata_tra_wonkt <- repFilter(immdata_tra, "by.clonotype",
list(V.name = exclude("TRAV11"), J.name = exclude("TRAJ18")),
.match = "substring")
## ALL TRA
repDiversity(immdata_tra_wonkt$data) %>% vis(.by = c( "Mouse_strain"), .meta = immdata_tra$meta,
.signif.label.size = 0) + ggtitle("TRA without NKT") + scale_fill_manual(values = c("dodgerblue","indianred2","gray70")) + ggtitle("all TRA", subtitle = "")#ggsave("./final_fig/diversity/chao_tra_wonkt_sign.png", width = 8, height = 13, units = "cm")
#ggsave("./final_fig/diversity/chao_tra_wonkt_sign.svg", width = 8, height = 13, units = "cm")
## ALL TRB
repDiversity(immdata_trb$data) %>% vis(.by = c( "Mouse_strain"), .meta = immdata_trb$meta,
.signif.label.size = 0) + ggtitle("TRB without NKT") + scale_fill_manual(values = c("dodgerblue","indianred2","gray70")) + ggtitle("all TRB", subtitle = "")#ggsave("./final_fig/diversity/chao_trb_sign.png", width = 8, height = 13, units = "cm")
#ggsave("./final_fig/diversity/chao_trb_sign.svg", width = 8, height = 13, units = "cm")
## Chao index
repDiversity(immdata_tra_wonkt$data) %>% vis(.by = c("Cell_type", "Organ","Mouse_strain"), .meta = immdata_tra$meta, .test = F,
.signif.label.size = 0) + ggtitle("TRA")#ggsave("./plots/chao_tra.png", width = 15, height = 11, units = "cm")
#ggsave("./plots/chao_tra.svg", width = 15, height = 11, units = "cm")
repDiversity(immdata_trb$data) %>% vis(.by = c("Cell_type", "Organ","Mouse_strain"), .meta = immdata_trb$meta, .test = F,
.signif.label.size = 0) + ggtitle("TRB", subtitle = "")#ggsave("./plots/chao_trb.png", width = 15, height = 11, units = "cm")
#ggsave("./plots/chao_trb.svg", width = 15, height = 11, units = "cm")chao_tra <- as.data.frame(repDiversity(immdata_tra_wonkt$data))
chao_trb <- as.data.frame(repDiversity(immdata_trb$data))chao_tra2 <- chao_tra %>%
rownames_to_column("num_id") %>%
mutate(num_id = str_replace_all(string = num_id, c("new_lib_24" = "lib24_S24_L001", "new_lib_25" = "lib25_S25_L001"))) %>%
separate(num_id, into = c("num_id",NA,NA), sep = "_") %>%
mutate(num_id = str_replace(num_id, "lib","")) %>%
mutate_at(vars("num_id"), as.numeric)
levels_cd4 <- c("CD4 Thymus WT", "CD4 Thymus CA", "CD4 Thymus CAKR", "CD4 Lymph nodes WT", "CD4 Lymph nodes CA", "CD4 Lymph nodes CAKR")
# Attach metadata sent for sequencing
chao_tra2 <- left_join(chao_tra2, md) %>% select(value = Estimator, Cell_type, Organ, Mouse_strain) %>%
mutate(sample_type = paste(Cell_type, Organ, Mouse_strain)) ## Error in `left_join()`:
## ! `by` must be supplied when `x` and `y` have no common variables.
## ℹ use by = character()` to perform a cross-join.
chao_tra2 %>% filter(Cell_type == "CD4") %>%
ggplot(aes(y = value, x = factor(sample_type, levels = levels_cd4))) +
stat_summary(fun = "mean",
geom = "crossbar",
width = 0.5,
colour = "black") +
geom_point(aes(color = Mouse_strain)) +
ylab("Chao index estimate") +
ggtitle("CD4 TRA") +
theme(axis.text.x = element_text(angle = 90)) +
theme_classic() +
ggtheme() +
theme(axis.text.x = element_text(angle = 90)) +
scale_color_manual(values = c("dodgerblue","indianred2","gray40")) +
ylim(c(0,NA)) + xlab("")## Error in `filter()`:
## ! Problem while computing `..1 = Cell_type == "CD4"`.
## Caused by error in `mask$eval_all_filter()`:
## ! object 'Cell_type' not found
#ggsave("./final_fig/diversity/chao_cd4_tra.png", width = 8, height = 10, units = "cm")
#ggsave("./final_fig/diversity/chao_cd4_tra.eps", width = 8, height = 10, units = "cm")
levels_cd8 <- c("CD8 Thymus WT", "CD8 Thymus CA", "CD8 Thymus CAKR", "CD8 Lymph nodes WT", "CD8 Lymph nodes CA", "CD8 Lymph nodes CAKR")
chao_tra2 %>% filter(Cell_type == "CD8") %>%
ggplot(aes(y = value, x = factor(sample_type, levels = levels_cd8))) +
stat_summary(fun = "mean",
geom = "crossbar",
width = 0.5,
colour = "black") +
geom_point(aes(color = Mouse_strain)) +
ylab("Chao index estimate") +
ggtitle("CD8 TRA") +
theme(axis.text.x = element_text(angle = 90)) +
theme_classic() +
ggtheme() +
theme(axis.text.x = element_text(angle = 90)) +
scale_color_manual(values = c("dodgerblue","indianred2","gray40")) +
ylim(c(0,NA)) + xlab("")## Error in `filter()`:
## ! Problem while computing `..1 = Cell_type == "CD8"`.
## Caused by error in `mask$eval_all_filter()`:
## ! object 'Cell_type' not found
#ggsave("./final_fig/diversity/chao_cd8_tra.png", width = 8, height = 10, units = "cm")
#ggsave("./final_fig/diversity/chao_cd8_tra.eps", width = 8, height = 10, units = "cm")chao_trb2 <- chao_trb %>%
rownames_to_column("num_id") %>% mutate(num_id = str_replace_all(string = num_id, c("new_lib_24" = "lib24_S24_L001", "new_lib_25" = "lib25_S25_L001"))) %>%
separate(num_id, into = c("num_id",NA,NA), sep = "_") %>%
mutate(num_id = str_replace(num_id, "lib","")) %>%
mutate_at(vars("num_id"), as.numeric)
levels_cd4 <- c("CD4 Thymus WT", "CD4 Thymus CA", "CD4 Thymus CAKR", "CD4 Lymph nodes WT", "CD4 Lymph nodes CA", "CD4 Lymph nodes CAKR")
chao_trb2 <- left_join(chao_trb2, md) %>% select(value = Estimator, Cell_type, Organ, Mouse_strain) %>%
mutate(sample_type = paste(Cell_type, Organ, Mouse_strain)) ## Error in `left_join()`:
## ! `by` must be supplied when `x` and `y` have no common variables.
## ℹ use by = character()` to perform a cross-join.
chao_trb2 %>% filter(Cell_type == "CD4") %>%
ggplot(aes(y = value, x = factor(sample_type, levels = levels_cd4))) +
stat_summary(fun = "mean",
geom = "crossbar",
width = 0.5,
colour = "black") +
geom_point(aes(color = Mouse_strain)) +
ylab("Chao index estimate") +
ggtitle("CD4 TRB") +
theme(axis.text.x = element_text(angle = 90)) +
theme_classic() +
ggtheme() +
theme(axis.text.x = element_text(angle = 90)) +
scale_color_manual(values = c("dodgerblue","indianred2","gray40")) +
ylim(c(0,125500)) + xlab("")## Error in `filter()`:
## ! Problem while computing `..1 = Cell_type == "CD4"`.
## Caused by error in `mask$eval_all_filter()`:
## ! object 'Cell_type' not found
#ggsave("./final_fig/diversity/chao_cd4_trb.png", width = 8, height = 10, units = "cm")
#ggsave("./final_fig/diversity/chao_cd4_trb.eps", width = 8, height = 10, units = "cm")
levels_cd8 <- c("CD8 Thymus WT", "CD8 Thymus CA", "CD8 Thymus CAKR", "CD8 Lymph nodes WT", "CD8 Lymph nodes CA", "CD8 Lymph nodes CAKR")
chao_trb2 %>% filter(Cell_type == "CD8") %>%
ggplot(aes(y = value, x = factor(sample_type, levels = levels_cd8))) +
stat_summary(fun = "mean",
geom = "crossbar",
width = 0.5,
colour = "black") +
geom_point(aes(color = Mouse_strain)) +
ylab("Chao index estimate") +
ggtitle("CD8 TRB") +
theme(axis.text.x = element_text(angle = 90)) +
theme_classic() +
ggtheme() +
theme(axis.text.x = element_text(angle = 90)) +
scale_color_manual(values = c("dodgerblue","indianred2","gray40")) +
ylim(c(0,125500)) + xlab("")## Error in `filter()`:
## ! Problem while computing `..1 = Cell_type == "CD8"`.
## Caused by error in `mask$eval_all_filter()`:
## ! object 'Cell_type' not found
#ggsave("./final_fig/diversity/chao_cd8_trb.png", width = 8, height = 10, units = "cm")
#ggsave("./final_fig/diversity/chao_cd8_trb.eps", width = 8, height = 10, units = "cm")chao_trb2 %>%
ggplot(aes(y = value, x = factor(Mouse_strain, levels = c("WT","CA","CAKR")))) +
stat_summary(fun = "mean",
geom = "crossbar",
width = 0.5,
colour = "black") +
geom_dotplot(binaxis='y', stackdir='center', dotsize=0) +
geom_jitter(position=position_jitter(0.1), size = 2, aes(color = Mouse_strain)) +
ylab("Chao index estimate") +
theme(axis.text.x = element_text(angle = 90)) +
theme_classic() +
ggtheme() +
theme(axis.text.x = element_text(angle = 90)) +
scale_color_manual(values = c("dodgerblue","indianred2","gray40")) +
ylim(c(0,NA)) + xlab("")## Error in factor(Mouse_strain, levels = c("WT", "CA", "CAKR")): object 'Mouse_strain' not found
#ggsave("./final_fig/diversity/chao_all_trb.png", width = 8, height = 6, units = "cm")
#ggsave("./final_fig/diversity/chao_all_trb.eps", width = 8, height = 6, units = "cm")
chao_tra2 %>%
ggplot(aes(y = value, x = factor(Mouse_strain, levels = c("WT","CA","CAKR")))) +
stat_summary(fun = "mean",
geom = "crossbar",
width = 0.5,
colour = "black") +
geom_dotplot(binaxis='y', stackdir='center', dotsize=0) +
geom_jitter(position=position_jitter(0.1), size = 2, aes(color = Mouse_strain)) +
ylab("Chao index estimate") +
theme(axis.text.x = element_text(angle = 90)) +
theme_classic() +
ggtheme() +
theme(axis.text.x = element_text(angle = 90)) +
scale_color_manual(values = c("dodgerblue","indianred2","gray40")) +
ylim(c(0,NA)) + xlab("")## Error in factor(Mouse_strain, levels = c("WT", "CA", "CAKR")): object 'Mouse_strain' not found
#ggsave("./final_fig/diversity/chao_all_tra.png", width = 8, height = 6, units = "cm")
#ggsave("./final_fig/diversity/chao_all_tra.eps", width = 8, height = 6, units = "cm")topclones <- prop.table.trb_filter %>% mutate(topclones_ln_cd4 = (`CD4 Lymph nodes WT Exp02`+
`CD4 Lymph nodes WT Exp03`)/2,
topclones_ln_cd8 = (`CD8 Lymph nodes WT Exp01`+
`CD8 Lymph nodes WT Exp02`+
`CD8 Lymph nodes WT Exp03`)/3)
cd4_topclones_20sequences_trb <- pull(topclones %>% slice_max(order_by = topclones_ln_cd4, n = 20), aaSeqCDR3)
cd8_topclones_20sequences_trb <- pull(topclones %>% slice_max(order_by = topclones_ln_cd8, n = 20), aaSeqCDR3)topclones <- prop.table.tra_filter %>% mutate(topclones_ln_cd4 = (`CD4 Lymph nodes WT Exp02`+
`CD4 Lymph nodes WT Exp03`)/2,
topclones_ln_cd8 = (`CD8 Lymph nodes WT Exp01`+
`CD8 Lymph nodes WT Exp02`+
`CD8 Lymph nodes WT Exp03`)/3)
cd4_topclones_20sequences_tra <- pull(topclones %>% slice_max(order_by = topclones_ln_cd4, n = 20), aaSeqCDR3)
cd8_topclones_20sequences_tra <- pull(topclones %>% slice_max(order_by = topclones_ln_cd8, n = 20), aaSeqCDR3)cd4_topclones_rep_pct <- prop.table.tra2 %>% filter(aaSeqCDR3 %in% cd4_topclones_20sequences_tra) %>% select(aaSeqCDR3, starts_with("CD4")) %>% select(1,14,15,9:13,7,8,2:6)
cd4_topclones_rep_pct2 <- as.data.frame(colSums(cd4_topclones_rep_pct[,2:15]))
colnames(cd4_topclones_rep_pct2) <- "value"
cd4_topclones_rep_pct2 <- cd4_topclones_rep_pct2 %>%
rownames_to_column("sample") %>%
mutate(sample = stringr::str_replace_all(sample, pattern = "Lymph nodes", replacement = "LN")) %>%
separate(sample, into = c("Cell_type","Organ","Strain","Exp")) %>%
mutate(Cell_type_organ = factor(paste(Organ, Strain), levels = c("Thymus WT","Thymus CA", "Thymus CAKR",
"LN WT","LN CA", "LN CAKR" )))
cd4_topclones_rep_pct2## Cell_type Organ Strain Exp value Cell_type_organ
## 1 CD4 Thymus WT Exp02 0.02719600 Thymus WT
## 2 CD4 Thymus WT Exp03 0.02768635 Thymus WT
## 3 CD4 Thymus CA Exp01 0.02961628 Thymus CA
## 4 CD4 Thymus CA Exp03 0.02546552 Thymus CA
## 5 CD4 Thymus CAKR Exp01 0.02006336 Thymus CAKR
## 6 CD4 Thymus CAKR Exp02 0.02571295 Thymus CAKR
## 7 CD4 Thymus CAKR Exp03 0.02394512 Thymus CAKR
## 8 CD4 LN WT Exp02 0.03291915 LN WT
## 9 CD4 LN WT Exp03 0.03092514 LN WT
## 10 CD4 LN CA Exp01 0.03194544 LN CA
## 11 CD4 LN CA Exp02 0.03571429 LN CA
## 12 CD4 LN CA Exp03 0.03472770 LN CA
## 13 CD4 LN CAKR Exp02 0.03056850 LN CAKR
## 14 CD4 LN CAKR Exp03 0.02854483 LN CAKR
cd4_topclones_rep_pct2 %>%
filter(Organ == "LN") %>%
ggplot(aes(y = value*100, x = factor(Cell_type_organ))) +
stat_summary(fun = "mean",
geom = "crossbar",
width = 0.5,
colour = "black") +
geom_point(aes(color = Strain)) +
ylab("% of repertoire") +
ggtitle("CD4 TRA") +
theme(axis.text.x = element_text(angle = 90)) +
theme_classic() +
ggtheme() +
theme(axis.text.x = element_text(angle = 90)) +
scale_color_manual(values = c("dodgerblue","indianred2","gray40")) +
ylim(c(0,NA)) + xlab("")#ggsave("./final_fig/20topseqs/20topseqs_pct_tra_cd4.png", width = 5, height = 5.7, units = "cm")
#ggsave("./final_fig/20topseqs/20topseqs_pct_tra_cd4.eps", width = 5, height = 5.7, units = "cm")cd4_topclones_rep_pct <- prop.table.trb2 %>% filter(aaSeqCDR3 %in% cd4_topclones_20sequences_trb) %>% select(aaSeqCDR3, starts_with("CD4")) %>% select(1,14,15,9:13,7,8,2:6)
cd4_topclones_rep_pct2 <- as.data.frame(colSums(cd4_topclones_rep_pct[,2:15]))
colnames(cd4_topclones_rep_pct2) <- "value"
cd4_topclones_rep_pct2 <- cd4_topclones_rep_pct2 %>%
rownames_to_column("sample") %>%
mutate(sample = stringr::str_replace_all(sample, pattern = "Lymph nodes", replacement = "LN")) %>%
separate(sample, into = c("Cell_type","Organ","Strain","Exp")) %>%
mutate(Cell_type_organ = factor(paste(Organ, Strain), levels = c("Thymus WT","Thymus CA", "Thymus CAKR",
"LN WT","LN CA", "LN CAKR" )))
cd4_topclones_rep_pct2## Cell_type Organ Strain Exp value Cell_type_organ
## 1 CD4 Thymus WT Exp02 0.002983516 Thymus WT
## 2 CD4 Thymus WT Exp03 0.003173193 Thymus WT
## 3 CD4 Thymus CA Exp01 0.002918556 Thymus CA
## 4 CD4 Thymus CA Exp03 0.002984742 Thymus CA
## 5 CD4 Thymus CAKR Exp01 0.002182929 Thymus CAKR
## 6 CD4 Thymus CAKR Exp02 0.001994960 Thymus CAKR
## 7 CD4 Thymus CAKR Exp03 0.002340585 Thymus CAKR
## 8 CD4 LN WT Exp02 0.007174320 LN WT
## 9 CD4 LN WT Exp03 0.004953674 LN WT
## 10 CD4 LN CA Exp01 0.004225770 LN CA
## 11 CD4 LN CA Exp02 0.004535615 LN CA
## 12 CD4 LN CA Exp03 0.005438813 LN CA
## 13 CD4 LN CAKR Exp02 0.003940359 LN CAKR
## 14 CD4 LN CAKR Exp03 0.003868959 LN CAKR
cd4_topclones_rep_pct2 %>%
filter(Organ == "LN") %>%
ggplot(aes(y = value*100, x = factor(Cell_type_organ))) +
stat_summary(fun = "mean",
geom = "crossbar",
width = 0.5,
colour = "black") +
geom_point(aes(color = Strain)) +
ylab("% of repertoire") +
ggtitle("CD4 TRB") +
theme(axis.text.x = element_text(angle = 90)) +
theme_classic() +
ggtheme() +
theme(axis.text.x = element_text(angle = 90)) +
scale_color_manual(values = c("dodgerblue","indianred2","gray40")) +
ylim(c(0,NA)) + xlab("")#ggsave("./final_fig/20topseqs/20topseqs_pct_trb_cd4.png", width = 5.4, height = 5.7, units = "cm")
#ggsave("./final_fig/20topseqs/20topseqs_pct_trb_cd4.eps", width = 5.4, height = 5.7, units = "cm")cd8_topclones_rep_pct <- prop.table.tra2 %>% filter(aaSeqCDR3 %in% cd8_topclones_20sequences_tra) %>% select(aaSeqCDR3, starts_with("CD8")) %>% select(1,13:15,9:12,6:8,2:5)
cd8_topclones_rep_pct2 <- as.data.frame(colSums(cd8_topclones_rep_pct[,2:15]))
colnames(cd8_topclones_rep_pct2) <- "value"
cd8_topclones_rep_pct2 <- cd8_topclones_rep_pct2 %>%
rownames_to_column("sample") %>%
mutate(sample = stringr::str_replace_all(sample, pattern = "Lymph nodes", replacement = "LN")) %>%
separate(sample, into = c("Cell_type","Organ","Strain","Exp")) %>%
mutate(Cell_type_organ = factor(paste(Organ, Strain), levels = c("Thymus WT","Thymus CA", "Thymus CAKR",
"LN WT","LN CA", "LN CAKR" )))
cd8_topclones_rep_pct2## Cell_type Organ Strain Exp value Cell_type_organ
## 1 CD8 Thymus WT Exp01 0.03400861 Thymus WT
## 2 CD8 Thymus WT Exp02 0.02653846 Thymus WT
## 3 CD8 Thymus WT Exp03 0.03137384 Thymus WT
## 4 CD8 Thymus CA Exp01 0.02447936 Thymus CA
## 5 CD8 Thymus CA Exp03 0.02870174 Thymus CA
## 6 CD8 Thymus CAKR Exp02 0.03063281 Thymus CAKR
## 7 CD8 Thymus CAKR Exp03 0.02280437 Thymus CAKR
## 8 CD8 LN WT Exp01 0.03753115 LN WT
## 9 CD8 LN WT Exp02 0.03764212 LN WT
## 10 CD8 LN WT Exp03 0.03642344 LN WT
## 11 CD8 LN CA Exp02 0.03592586 LN CA
## 12 CD8 LN CA Exp03 0.03213014 LN CA
## 13 CD8 LN CAKR Exp02 0.02946860 LN CAKR
## 14 CD8 LN CAKR Exp03 0.03059997 LN CAKR
cd8_topclones_rep_pct2 %>%
filter(Organ == "LN") %>%
ggplot(aes(y = value*100, x = factor(Cell_type_organ))) +
stat_summary(fun = "mean",
geom = "crossbar",
width = 0.5,
colour = "black") +
geom_point(aes(color = Strain)) +
ylab("% of repertoire") +
ggtitle("CD8 TRA") +
theme(axis.text.x = element_text(angle = 90)) +
theme_classic() +
ggtheme() +
theme(axis.text.x = element_text(angle = 90)) +
scale_color_manual(values = c("dodgerblue","indianred2","gray40")) +
ylim(c(0,NA)) + xlab("")#ggsave("./final_fig/20topseqs/20topseqs_pct_tra_cd8.png", width = 5, height = 5.7, units = "cm")
#ggsave("./final_fig/20topseqs/20topseqs_pct_tra_cd8.eps", width = 5, height = 5.7, units = "cm")cd8_topclones_rep_pct <- prop.table.trb2 %>% filter(aaSeqCDR3 %in% cd8_topclones_20sequences_trb) %>% select(aaSeqCDR3, starts_with("CD8")) %>% select(1,13:15,9:12,6:8,2:5)
cd8_topclones_rep_pct2 <- as.data.frame(colSums(cd8_topclones_rep_pct[,2:15]))
colnames(cd8_topclones_rep_pct2) <- "value"
cd8_topclones_rep_pct2 <- cd8_topclones_rep_pct2 %>%
rownames_to_column("sample") %>%
mutate(sample = stringr::str_replace_all(sample, pattern = "Lymph nodes", replacement = "LN")) %>%
separate(sample, into = c("Cell_type","Organ","Strain","Exp")) %>%
mutate(Cell_type_organ = factor(paste(Organ, Strain), levels = c("Thymus WT","Thymus CA", "Thymus CAKR",
"LN WT","LN CA", "LN CAKR" )))
cd8_topclones_rep_pct2## Cell_type Organ Strain Exp value Cell_type_organ
## 1 CD8 Thymus WT Exp01 0.004631440 Thymus WT
## 2 CD8 Thymus WT Exp02 0.004479484 Thymus WT
## 3 CD8 Thymus WT Exp03 0.004929262 Thymus WT
## 4 CD8 Thymus CA Exp01 0.005472183 Thymus CA
## 5 CD8 Thymus CA Exp03 0.004517467 Thymus CA
## 6 CD8 Thymus CAKR Exp02 0.003983029 Thymus CAKR
## 7 CD8 Thymus CAKR Exp03 0.005094131 Thymus CAKR
## 8 CD8 LN WT Exp01 0.007470700 LN WT
## 9 CD8 LN WT Exp02 0.006188663 LN WT
## 10 CD8 LN WT Exp03 0.007226656 LN WT
## 11 CD8 LN CA Exp02 0.006775749 LN CA
## 12 CD8 LN CA Exp03 0.007135684 LN CA
## 13 CD8 LN CAKR Exp02 0.007553967 LN CAKR
## 14 CD8 LN CAKR Exp03 0.006905450 LN CAKR
cd8_topclones_rep_pct2 %>%
filter(Organ == "LN") %>%
ggplot(aes(y = value*100, x = factor(Cell_type_organ))) +
stat_summary(fun = "mean",
geom = "crossbar",
width = 0.5,
colour = "black") +
geom_point(aes(color = Strain)) +
ylab("% of repertoire") +
ggtitle("CD8 TRB") +
theme(axis.text.x = element_text(angle = 90)) +
theme_classic() +
ggtheme() +
theme(axis.text.x = element_text(angle = 90)) +
scale_color_manual(values = c("dodgerblue","indianred2","gray40")) +
ylim(c(0,NA)) + xlab("")#ggsave("./final_fig/20topseqs/20topseqs_pct_trb_cd8.png", width = 5.4, height = 5.7, units = "cm")
#ggsave("./final_fig/20topseqs/20topseqs_pct_trb_cd8.eps", width = 5.4, height = 5.7, units = "cm")# TRA
cd4_topclones_heatmap_matrix <- prop.table.tra2 %>% filter(aaSeqCDR3 %in% cd4_topclones_20sequences_tra) %>% select(aaSeqCDR3, starts_with("CD4")) %>% select(1,14,15,9:13,7,8,2:6)
cd4_topclones_heatmap_matrix2 <- as.matrix(cd4_topclones_heatmap_matrix[,2:15])
rownames(cd4_topclones_heatmap_matrix2) <- cd4_topclones_heatmap_matrix$aaSeqCDR3
cd4_topclones_heatmap_matrix2 <- cd4_topclones_heatmap_matrix2[match( cd4_topclones_20sequences_tra, rownames(cd4_topclones_heatmap_matrix2)),]
pheatmap(cd4_topclones_heatmap_matrix2, scale = "none", cluster_rows = F, cluster_cols = F)#ggsave(plot = pheatmap(cd4_topclones_heatmap_matrix2, scale = "none", cluster_rows = F, cluster_cols = F), "final_fig/heatmap_top20_ordered/cd4_tra_noscale.svg")# TRA
cd8_topclones_heatmap_matrix <- prop.table.tra2 %>% filter(aaSeqCDR3 %in% cd8_topclones_20sequences_tra) %>% select(aaSeqCDR3, starts_with("CD8")) %>% select(1,13:15,9:12,6:8,2:5)
cd8_topclones_heatmap_matrix2 <- as.matrix(cd8_topclones_heatmap_matrix[,2:15])
rownames(cd8_topclones_heatmap_matrix2) <- cd8_topclones_heatmap_matrix$aaSeqCDR3
cd8_topclones_heatmap_matrix2 <- cd8_topclones_heatmap_matrix2[match( cd8_topclones_20sequences_tra, rownames(cd8_topclones_heatmap_matrix2)),]
pheatmap(cd8_topclones_heatmap_matrix2, scale = "none", cluster_rows = F, cluster_cols = F)#ggsave(plot = pheatmap(cd8_topclones_heatmap_matrix2, scale = "none", cluster_rows = F, cluster_cols = F), "final_fig/heatmap_top20_ordered/cd8_tra_noscale.svg")without 1st sequence
### without 1st sequence
cd8_topclones_heatmap_matrix2 <- as.matrix(cd8_topclones_heatmap_matrix[2:20,2:15])
rownames(cd8_topclones_heatmap_matrix2) <- cd8_topclones_heatmap_matrix$aaSeqCDR3[2:20]
cd8_topclones_heatmap_matrix2 <- cd8_topclones_heatmap_matrix2[match( cd8_topclones_20sequences_tra[2:20], rownames(cd8_topclones_heatmap_matrix2)),]
pheatmap(cd8_topclones_heatmap_matrix2, scale = "none", cluster_rows = F, cluster_cols = F)#ggsave(plot = pheatmap(cd8_topclones_heatmap_matrix2, scale = "none", cluster_rows = F, cluster_cols = F), "final_fig/heatmap_top20_ordered/cd8_tra_noscale_without.svg")# trb
cd4_topclones_heatmap_matrix <- prop.table.trb2 %>% filter(aaSeqCDR3 %in% cd4_topclones_20sequences_trb) %>% select(aaSeqCDR3, starts_with("CD4")) %>% select(1,14,15,9:13,7,8,2:6)
cd4_topclones_heatmap_matrix2 <- as.matrix(cd4_topclones_heatmap_matrix[,2:15])
rownames(cd4_topclones_heatmap_matrix2) <- cd4_topclones_heatmap_matrix$aaSeqCDR3
cd4_topclones_heatmap_matrix2 <- cd4_topclones_heatmap_matrix2[match( cd4_topclones_20sequences_trb, rownames(cd4_topclones_heatmap_matrix2)),]
pheatmap(cd4_topclones_heatmap_matrix2, scale = "none", cluster_rows = F, cluster_cols = F)#ggsave(plot = pheatmap(cd4_topclones_heatmap_matrix2, scale = "none", cluster_rows = F, cluster_cols = F), "final_fig/heatmap_top20_ordered/cd4_trb_noscale.svg")# trb
cd8_topclones_heatmap_matrix <- prop.table.trb2 %>% filter(aaSeqCDR3 %in% cd8_topclones_20sequences_trb) %>% select(aaSeqCDR3, starts_with("CD8")) %>% select(1,13:15,9:12,6:8,2:5)
cd8_topclones_heatmap_matrix2 <- as.matrix(cd8_topclones_heatmap_matrix[,2:15])
rownames(cd8_topclones_heatmap_matrix2) <- cd8_topclones_heatmap_matrix$aaSeqCDR3
cd8_topclones_heatmap_matrix2 <- cd8_topclones_heatmap_matrix2[match( cd8_topclones_20sequences_trb, rownames(cd8_topclones_heatmap_matrix2)),]
pheatmap(cd8_topclones_heatmap_matrix2, scale = "none", cluster_rows = F, cluster_cols = F)# png("final_fig/heatmap_top20_ordered/cd8_trb_noscale.png", width = 400, height = 500)
# pheatmap(cd8_topclones_heatmap_matrix2, scale = "none", cluster_rows = F, cluster_cols = F)
# dev.off()
#ggsave(plot = pheatmap(cd8_topclones_heatmap_matrix2, scale = "none", cluster_rows = F, cluster_cols = F), "final_fig/heatmap_top20_ordered/cd8_trb_noscale.svg")binary_tra <- excel_count_table_tra %>%
mutate(nkt_trav11_traj18 = if_else(
(grepl(allVHitsWithScore, pattern = "TRAV11") |
(grepl(allJHitsWithScore, pattern = "TRAJ18"))),"yes","no")) %>%
filter(nkt_trav11_traj18 == "no") %>%
mutate_at(vars(starts_with("CD")), .funs = binary)
# count_overlap <- function(df, column_number){
# repertoire <- df %>% select(vars(column_number), aaSeqCDR3) %>% filter(vars(column_number)>0) %>% pull("aaSeqCDR3")
# nrow_column <- length(repertoire)
# intersect_rept <- length(intersect(repertoire, most_diverse_repertoire))
# pct <- intersect_rept/length()
# return(x)
# }
binary_tra_longer <- binary_tra %>% select(starts_with("CD"), aaSeqCDR3) %>% pivot_longer(!aaSeqCDR3, names_to = "num_id")
df_all4 <- data.frame("")
for(j in levels(factor(binary_tra_longer$num_id))){
subset1 <- binary_tra_longer %>% filter(num_id == j & value>0) %>% pull("aaSeqCDR3")
vector_overlap <- c()
for(i in levels(factor(binary_tra_longer$num_id))){
subset2 <- binary_tra_longer %>% filter(num_id == i & value>0) %>% pull("aaSeqCDR3")
intersect_rep <- length(intersect(subset1, subset2))
total <- length(subset2)
vector_overlap <- c(vector_overlap,intersect_rep/total)
}
df <- as.data.frame(x = vector_overlap)
colnames(df) <- j
df
df_all4 <- cbind(df_all4, df)
}
df_all4 <- df_all4[,2:29]
rownames(df_all4) <- colnames(df_all4)
#write.csv(df_all4, "rep_overlap_tra.csv")df24 <- df_all4
df24[df24 == 1] <- 0
### plot dotplot of overlaps
df25 <- df24 %>% rownames_to_column("id") %>% pivot_longer(-id)
g2 <- ggplot(df25, aes(id, factor(name, levels = rev(levels(factor(name)))))) +
geom_point(aes(size = value*100, colour = value*100)) +
theme_bw()
g2 + scale_size_continuous(range=c(7,12)) +
geom_text(aes(label = round(value*100, digits = 1))) +
scale_colour_gradient2(low = "lightskyblue", mid = "lightsteelblue2", high = "salmon") +
theme(axis.text.x = element_text(angle = 90)) rep_overlap_tra <- read_csv("rep_overlap_tra.csv") %>% column_to_rownames("...1")
rep_overlap_tra_cd4 <- rep_overlap_tra[1:14,1:14]
rep_overlap_tra_cd8 <- rep_overlap_tra[15:28,15:28]
rep_overlap_tra_cd4 <- rep_overlap_tra_cd4[ rev(c(5, 4, 3, 2, 1, 7, 6, 12, 11, 10, 9, 8, 14, 13)),
rev(c(5, 4, 3, 2, 1, 7, 6, 12, 11, 10, 9, 8, 14, 13))]
levels_names <- colnames(rep_overlap_tra_cd4)
df_overlap_tra_cd4 <- rep_overlap_tra_cd4
df_overlap_tra_cd4[df_overlap_tra_cd4 == 1] <- 0
### plot dotplot of overlaps
df_overlap_tra_cd4 <- df_overlap_tra_cd4 %>% rownames_to_column("id") %>% pivot_longer(-id)
g2 <- ggplot(df_overlap_tra_cd4, aes(x = factor(name, levels = levels_names), y = factor(id, levels = rev(levels_names)))) +
geom_point(aes(size = value*100, colour = value*100)) +
theme_bw()
g2 + scale_size_continuous(range=c(5,9)) +
geom_text(aes(label = round(value*100, digits = 1)), size = 3) +
scale_colour_gradient2(low = "white", mid = "lightsteelblue", high = "salmon") +
theme(axis.text.x = element_text(angle = 90)) + ggtheme()#ggsave(file = "./final_fig/overlap_cd4_tra.png", width = 20, height = 16, units = "cm")
#ggsave(file = "./final_fig/overlap_cd4_tra.svg", width = 20, height = 16, units = "cm")rep_overlap_tra_cd8 <- rep_overlap_tra_cd8[rev(c(12:14,8:11,5:7,1:4)),rev(c(12:14,8:11,5:7,1:4))]
levels_names <- rev(colnames(rep_overlap_tra_cd8))
df_overlap_tra_cd8 <- rep_overlap_tra_cd8
df_overlap_tra_cd8[df_overlap_tra_cd8 == 1] <- 0
### plot dotplot of overlaps
df_overlap_tra_cd8 <- df_overlap_tra_cd8 %>% rownames_to_column("id") %>% pivot_longer(-id)
g2 <- ggplot(df_overlap_tra_cd8, aes(factor(name, levels = levels_names), factor(id, levels = rev(levels_names)))) +
geom_point(aes(size = value*100, colour = value*100)) +
theme_bw()
g2 + scale_size_continuous(range=c(5,9)) +
geom_text(aes(label = round(value*100, digits = 1)), size = 3) +
scale_colour_gradient2(low = "white", mid = "lightsteelblue", high = "salmon") +
theme(axis.text.x = element_text(angle = 90)) + ggtheme()#ggsave(file = "./final_fig/overlap_cd8_tra.png", width = 20, height = 16, units = "cm")
#ggsave(file = "./final_fig/overlap_cd8_tra.svg", width = 20, height = 16, units = "cm")violin_tra_cd4 <- rep_overlap_tra_cd4 %>%
rownames_to_column("sample1") %>%
pivot_longer(!sample1, names_to = "sample2") %>%
mutate(sample2 = stringr::str_replace_all(sample2, pattern = "Lymph nodes", replacement = "LN")) %>%
mutate(sample1 = stringr::str_replace_all(sample1, pattern = "Lymph nodes", replacement = "LN")) %>%
separate(sample1, into = c("Cell_type1","Organ1","Strain1","Exp1"), remove = F) %>%
separate(sample2, into = c("Cell_type2","Organ2","Strain2","Exp2"), remove = F) %>%
filter(value<1)
violin_tra_cd8 <- rep_overlap_tra_cd8 %>%
rownames_to_column("sample1") %>%
pivot_longer(!sample1, names_to = "sample2") %>%
mutate(sample2 = stringr::str_replace_all(sample2, pattern = "Lymph nodes", replacement = "LN")) %>%
mutate(sample1 = stringr::str_replace_all(sample1, pattern = "Lymph nodes", replacement = "LN")) %>%
separate(sample1, into = c("Cell_type1","Organ1","Strain1","Exp1"), remove = F) %>%
separate(sample2, into = c("Cell_type2","Organ2","Strain2","Exp2"), remove = F) %>%
filter(value<1)violin_tra_cd4 %>% filter(Organ1 == "Thymus" & Organ2 == "Thymus") %>%
mutate(organ_strain1 = paste(Organ1, Strain1),
organ_strain2 = paste(Organ2, Strain2)) %>%
mutate(comparison = paste(organ_strain1, organ_strain2, sep = " - ")) %>%
filter(comparison %in% c("Thymus WT - Thymus WT", "Thymus CA - Thymus WT", "Thymus CAKR - Thymus WT")) %>%
ggplot(aes(x = comparison, y = value)) +
geom_dotplot(binaxis='y', stackdir='center', dotsize=0) +
geom_jitter(position=position_jitter(0.2), size = 2, aes(color = comparison)) +
ggtitle("CD4 TRA Thymus") +
stat_summary(fun = "mean",
geom = "crossbar",
width = 0.5,
colour = "black") +
ylab("% repertoire overlap") +
theme_classic() +
ggtheme() +
theme(axis.text.x = element_text(angle = 90)) +
scale_color_manual(values = c("dodgerblue","indianred2","gray40")) +
ylim(c(0,0.35)) + xlab("")#ggsave("./final_fig/overlaps/rep_overlap_cd4_tra_thy.png", width = 9.5, height = 9.5, units = "cm")
#ggsave("./final_fig/overlaps/rep_overlap_cd4_tra_thy.eps", width = 9.5, height = 9.5, units = "cm")violin_tra_cd4 %>% filter(Organ1 == "LN" & Organ2 == "LN") %>%
mutate(organ_strain1 = paste(Organ1, Strain1),
organ_strain2 = paste(Organ2, Strain2)) %>%
mutate(comparison = paste(organ_strain1, organ_strain2, sep = " - ")) %>%
filter(comparison %in% c("LN WT - LN WT", "LN CA - LN WT", "LN CAKR - LN WT")) %>%
ggplot(aes(x = comparison, y = value)) +
geom_dotplot(binaxis='y', stackdir='center', dotsize=0) +
geom_jitter(position=position_jitter(0.2), size = 2, aes(color = comparison)) +
ggtitle("CD4 TRA LN") +
stat_summary(fun = "mean",
geom = "crossbar",
width = 0.5,
colour = "black") +
ylab("% repertoire overlap") +
theme_classic() +
ggtheme() +
theme(axis.text.x = element_text(angle = 90)) +
scale_color_manual(values = c("dodgerblue","indianred2","gray40")) +
ylim(c(0,0.35)) + xlab("")#ggsave("./final_fig/overlaps/rep_overlap_cd4_tra_LN.png", width = 8, height = 8, units = "cm")
#ggsave("./final_fig/overlaps/rep_overlap_cd4_tra_LN.eps", width = 8, height = 8, units = "cm")violin_tra_cd8 %>% filter(Organ1 == "Thymus" & Organ2 == "Thymus") %>%
mutate(organ_strain1 = paste(Organ1, Strain1),
organ_strain2 = paste(Organ2, Strain2)) %>%
mutate(comparison = paste(organ_strain1, organ_strain2, sep = " - ")) %>%
filter(comparison %in% c("Thymus WT - Thymus WT", "Thymus CA - Thymus WT", "Thymus CAKR - Thymus WT")) %>%
ggplot(aes(x = comparison, y = value)) +
geom_dotplot(binaxis='y', stackdir='center', dotsize=0) +
geom_jitter(position=position_jitter(0.2), size = 2, aes(color = comparison)) +
ggtitle("CD8 TRA Thymus") +
stat_summary(fun = "mean",
geom = "crossbar",
width = 0.5,
colour = "black") +
ylab("% repertoire overlap") +
theme_classic() +
ggtheme() +
theme(axis.text.x = element_text(angle = 90)) +
scale_color_manual(values = c("dodgerblue","indianred2","gray40")) +
ylim(c(0,0.35)) + xlab("")#ggsave("./final_fig/overlaps/rep_overlap_cd8_tra_thy.png", width = 9.5, height = 9.5, units = "cm")
#ggsave("./final_fig/overlaps/rep_overlap_cd8_tra_thy.eps", width = 9.5, height = 9.5, units = "cm")violin_tra_cd8 %>% filter(Organ1 == "LN" & Organ2 == "LN") %>%
mutate(organ_strain1 = paste(Organ1, Strain1),
organ_strain2 = paste(Organ2, Strain2)) %>%
mutate(comparison = paste(organ_strain1, organ_strain2, sep = " - ")) %>%
filter(comparison %in% c("LN WT - LN WT", "LN CA - LN WT", "LN CAKR - LN WT")) %>%
ggplot(aes(x = comparison, y = value)) +
geom_dotplot(binaxis='y', stackdir='center', dotsize=0) +
geom_jitter(position=position_jitter(0.2), size = 2, aes(color = comparison)) +
ggtitle("CD8 TRA LN") +
stat_summary(fun = "mean",
geom = "crossbar",
width = 0.5,
colour = "black") +
ylab("% repertoire overlap") +
theme_classic() +
ggtheme() +
theme(axis.text.x = element_text(angle = 90)) +
scale_color_manual(values = c("dodgerblue","indianred2","gray40")) +
ylim(c(0,0.35)) + xlab("")#ggsave("./final_fig/overlaps/rep_overlap_cd8_tra_LN.png", width = 8, height = 8, units = "cm")
#ggsave("./final_fig/overlaps/rep_overlap_cd8_tra_LN.eps", width = 8, height = 8, units = "cm")binary_trb <- excel_count_table_trb %>%
mutate_at(vars(starts_with("CD")), .funs = binary)
binary_trb_longer <- binary_trb %>% select(starts_with("CD"), aaSeqCDR3) %>% pivot_longer(!aaSeqCDR3, names_to = "num_id")
df_all4 <- data.frame("")
for(j in levels(factor(binary_trb_longer$num_id))){
subset1 <- binary_trb_longer %>% filter(num_id == j & value>0) %>% pull("aaSeqCDR3")
vector_overlap <- c()
for(i in levels(factor(binary_trb_longer$num_id))){
subset2 <- binary_trb_longer %>% filter(num_id == i & value>0) %>% pull("aaSeqCDR3")
intersect_rep <- length(intersect(subset1, subset2))
total <- length(subset2)
vector_overlap <- c(vector_overlap,intersect_rep/total)
}
df <- as.data.frame(x = vector_overlap)
colnames(df) <- j
df
df_all4 <- cbind(df_all4, df)
}
df_all4 <- df_all4[,2:29]
rownames(df_all4) <- colnames(df_all4)
#write.csv(df_all4, "rep_overlap_trb.csv")df24 <- df_all4
df24[df24 == 1] <- 0
### plot dotplot of overlaps
df25 <- df24 %>% rownames_to_column("id") %>% pivot_longer(-id)
g2 <- ggplot(df25, aes(id, factor(name, levels = rev(levels(factor(name)))))) +
geom_point(aes(size = value*100, colour = value*100)) +
theme_bw()
g2 + scale_size_continuous(range=c(7,12)) +
geom_text(aes(label = round(value*100, digits = 1))) +
scale_colour_gradient2(low = "lightskyblue", mid = "lightsteelblue2", high = "salmon") +
theme(axis.text.x = element_text(angle = 90)) rep_overlap_trb <- read_csv("rep_overlap_trb.csv") %>% column_to_rownames("...1")
rep_overlap_trb_cd4 <- rep_overlap_trb[1:14,1:14]
rep_overlap_trb_cd8 <- rep_overlap_trb[15:28,15:28]
rep_overlap_trb_cd4 <- rep_overlap_trb_cd4[rev(c(13,14,8:12,6,7,1:5)),rev(c(13,14,8:12,6,7,1:5))]
levels_names <- rev(colnames(rep_overlap_trb_cd4))
df_overlap_trb_cd4 <- rep_overlap_trb_cd4
df_overlap_trb_cd4[df_overlap_trb_cd4 == 1] <- 0
### plot dotplot of overlaps
df_overlap_trb_cd4 <- df_overlap_trb_cd4 %>% rownames_to_column("id") %>% pivot_longer(-id)
g2 <- ggplot(df_overlap_trb_cd4, aes(factor(name, levels = levels_names), factor(id, levels = rev(levels_names)))) +
geom_point(aes(size = value*100, colour = value*100)) +
theme_bw()
g2 + scale_size_continuous(range=c(5,9)) +
geom_text(aes(label = round(value*100, digits = 1)), size = 3) +
scale_colour_gradient2(low = "white", mid = "lightsteelblue", high = "salmon") +
theme(axis.text.x = element_text(angle = 90)) + ggtheme()#ggsave(file = "./final_fig/overlap_cd4_trb.png", width = 20, height = 16, units = "cm")
#ggsave(file = "./final_fig/overlap_cd4_trb.svg", width = 20, height = 16, units = "cm")rep_overlap_trb_cd8 <- rep_overlap_trb_cd8[rev(c(12:14,8:11,5:7,1:4)),rev(c(12:14,8:11,5:7,1:4))]
levels_names <- rev(colnames(rep_overlap_trb_cd8))
df_overlap_trb_cd8 <- rep_overlap_trb_cd8
df_overlap_trb_cd8[df_overlap_trb_cd8 == 1] <- 0
### plot dotplot of overlaps
df_overlap_trb_cd8 <- df_overlap_trb_cd8 %>% rownames_to_column("id") %>% pivot_longer(-id)
g2 <- ggplot(df_overlap_trb_cd8, aes(factor(name, levels = levels_names), factor(id, levels = rev(levels_names)))) +
geom_point(aes(size = value*100, colour = value*100)) +
theme_bw()
g2 + scale_size_continuous(range=c(5,9)) +
geom_text(aes(label = round(value*100, digits = 1)), size = 3) +
scale_colour_gradient2(low = "white", mid = "lightsteelblue", high = "salmon") +
theme(axis.text.x = element_text(angle = 90)) + ggtheme()#ggsave(file = "./final_fig/overlap_cd8_trb.png", width = 20, height = 16, units = "cm")
#ggsave(file = "./final_fig/overlap_cd8_trb.svg", width = 20, height = 16, units = "cm")violin_trb_cd4 <- rep_overlap_trb_cd4 %>%
rownames_to_column("sample1") %>%
pivot_longer(!sample1, names_to = "sample2") %>%
mutate(sample2 = stringr::str_replace_all(sample2, pattern = "Lymph nodes", replacement = "LN")) %>%
mutate(sample1 = stringr::str_replace_all(sample1, pattern = "Lymph nodes", replacement = "LN")) %>%
separate(sample1, into = c("Cell_type1","Organ1","Strain1","Exp1"), remove = F) %>%
separate(sample2, into = c("Cell_type2","Organ2","Strain2","Exp2"), remove = F) %>%
filter(value<1)
violin_trb_cd8 <- rep_overlap_trb_cd8 %>%
rownames_to_column("sample1") %>%
pivot_longer(!sample1, names_to = "sample2") %>%
mutate(sample2 = stringr::str_replace_all(sample2, pattern = "Lymph nodes", replacement = "LN")) %>%
mutate(sample1 = stringr::str_replace_all(sample1, pattern = "Lymph nodes", replacement = "LN")) %>%
separate(sample1, into = c("Cell_type1","Organ1","Strain1","Exp1"), remove = F) %>%
separate(sample2, into = c("Cell_type2","Organ2","Strain2","Exp2"), remove = F) %>%
filter(value<1)violin_trb_cd8 %>% filter(Organ1 == "Thymus" & Organ2 == "Thymus") %>%
mutate(organ_strbin1 = paste(Organ1, Strain1),
organ_strbin2 = paste(Organ2, Strain2)) %>%
mutate(comparison = paste(organ_strbin1, organ_strbin2, sep = " - ")) %>%
filter(comparison %in% c("Thymus WT - Thymus WT", "Thymus CA - Thymus WT", "Thymus CAKR - Thymus WT")) %>%
ggplot(aes(x = comparison, y = value)) +
geom_dotplot(binaxis='y', stackdir='center', dotsize=0) +
geom_jitter(position=position_jitter(0.2), size = 2, aes(color = comparison)) +
ggtitle("CD8 TRB Thymus") +
stat_summary(fun = "mean",
geom = "crossbar",
width = 0.5,
colour = "black") +
ylab("% repertoire overlap") +
theme_classic() +
ggtheme() +
theme(axis.text.x = element_text(angle = 90)) +
scale_color_manual(values = c("dodgerblue","indianred2","gray40")) +
ylim(c(0,0.25)) + xlab("")#ggsave("./final_fig/overlaps/rep_overlap_cd8_trb_thy.png", width = 9.5, height = 9.5, units = "cm")
#ggsave("./final_fig/overlaps/rep_overlap_cd8_trb_thy.eps", width = 9.5, height = 9.5, units = "cm")violin_trb_cd8 %>% filter(Organ1 == "LN" & Organ2 == "LN") %>%
mutate(organ_strbin1 = paste(Organ1, Strain1),
organ_strbin2 = paste(Organ2, Strain2)) %>%
mutate(comparison = paste(organ_strbin1, organ_strbin2, sep = " - ")) %>%
filter(comparison %in% c("LN WT - LN WT", "LN CA - LN WT", "LN CAKR - LN WT")) %>%
ggplot(aes(x = comparison, y = value)) +
geom_dotplot(binaxis='y', stackdir='center', dotsize=0) +
geom_jitter(position=position_jitter(0.2), size = 2, aes(color = comparison)) +
ggtitle("CD8 TRB LN") +
stat_summary(fun = "mean",
geom = "crossbar",
width = 0.5,
colour = "black") +
ylab("% repertoire overlap") +
theme_classic() +
ggtheme() +
theme(axis.text.x = element_text(angle = 90)) +
scale_color_manual(values = c("dodgerblue","indianred2","gray40")) +
ylim(c(0,0.25)) + xlab("")#ggsave("./final_fig/overlaps/rep_overlap_cd8_trb_LN.png", width = 8, height = 8, units = "cm")
#ggsave("./final_fig/overlaps/rep_overlap_cd8_trb_LN.eps", width = 8, height = 8, units = "cm")violin_trb_cd4 %>% filter(Organ1 == "Thymus" & Organ2 == "Thymus") %>%
mutate(organ_strbin1 = paste(Organ1, Strain1),
organ_strbin2 = paste(Organ2, Strain2)) %>%
mutate(comparison = paste(organ_strbin1, organ_strbin2, sep = " - ")) %>%
filter(comparison %in% c("Thymus WT - Thymus WT", "Thymus CA - Thymus WT", "Thymus CAKR - Thymus WT")) %>%
ggplot(aes(x = comparison, y = value)) +
geom_dotplot(binaxis='y', stackdir='center', dotsize=0) +
geom_jitter(position=position_jitter(0.2), size = 2, aes(color = comparison)) +
ggtitle("CD4 TRB Thymus") +
stat_summary(fun = "mean",
geom = "crossbar",
width = 0.5,
colour = "black") +
ylab("% repertoire overlap") +
theme_classic() +
ggtheme() +
theme(axis.text.x = element_text(angle = 90)) +
scale_color_manual(values = c("dodgerblue","indianred2","gray40")) +
ylim(c(0,0.20)) + xlab("")#ggsave("./final_fig/overlaps/rep_overlap_cd4_trb_thy.png", width = 9.5, height = 9.5, units = "cm")
#ggsave("./final_fig/overlaps/rep_overlap_cd4_trb_thy.eps", width = 9.5, height = 9.5, units = "cm")violin_trb_cd4 %>% filter(Organ1 == "LN" & Organ2 == "LN") %>%
mutate(organ_strbin1 = paste(Organ1, Strain1),
organ_strbin2 = paste(Organ2, Strain2)) %>%
mutate(comparison = paste(organ_strbin1, organ_strbin2, sep = " - ")) %>%
filter(comparison %in% c("LN WT - LN WT", "LN CA - LN WT", "LN CAKR - LN WT")) %>%
ggplot(aes(x = comparison, y = value)) +
geom_dotplot(binaxis='y', stackdir='center', dotsize=0) +
geom_jitter(position=position_jitter(0.2), size = 2, aes(color = comparison)) +
ggtitle("CD4 TRB LN") +
stat_summary(fun = "mean",
geom = "crossbar",
width = 0.5,
colour = "black") +
ylab("% repertoire overlap") +
theme_classic() +
ggtheme() +
theme(axis.text.x = element_text(angle = 90)) +
scale_color_manual(values = c("dodgerblue","indianred2","gray40")) +
ylim(c(0,0.20)) + xlab("")#ggsave("./final_fig/overlaps/rep_overlap_cd4_trb_LN.png", width = 8, height = 8, units = "cm")
#ggsave("./final_fig/overlaps/rep_overlap_cd4_trb_LN.eps", width = 8, height = 8, units = "cm")sessionInfo()## R version 4.2.1 (2022-06-23)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.5 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] SmartEDA_0.3.8 kableExtra_1.3.4 factoextra_1.0.7 pheatmap_1.0.12
## [5] immunarch_0.6.9 data.table_1.14.2 dtplyr_1.2.2 patchwork_1.1.2
## [9] cowplot_1.1.1 readxl_1.4.1 forcats_0.5.2 stringr_1.4.1
## [13] dplyr_1.0.9 purrr_0.3.4 readr_2.1.2 tidyr_1.2.0
## [17] tibble_3.1.8 ggplot2_3.3.6 tidyverse_1.3.1
##
## loaded via a namespace (and not attached):
## [1] uuid_1.1-0 backports_1.4.1 circlize_0.4.15
## [4] fastmatch_1.1-3 systemfonts_1.0.4 plyr_1.8.7
## [7] igraph_1.3.4 digest_0.6.29 foreach_1.5.2
## [10] htmltools_0.5.3 ggalluvial_0.12.3 fansi_1.0.3
## [13] magrittr_2.0.3 cluster_2.1.4 doParallel_1.0.17
## [16] tzdb_0.3.0 modelr_0.1.9 vroom_1.5.7
## [19] svglite_2.1.0 lpSolve_5.6.15 rmdformats_1.0.4
## [22] colorspace_2.0-3 rvest_1.0.3 ggrepel_0.9.1
## [25] haven_2.5.1 xfun_0.31 crayon_1.5.1
## [28] jsonlite_1.8.0 phangorn_2.9.0 iterators_1.0.14
## [31] ape_5.6-2 glue_1.6.2 gtable_0.3.0
## [34] webshot_0.5.3 UpSetR_1.4.0 car_3.1-0
## [37] kernlab_0.9-31 shape_1.4.6 prabclus_2.3-2
## [40] DEoptimR_1.0-11 ISLR_1.4 abind_1.4-5
## [43] scales_1.2.1 DBI_1.1.3 GGally_2.1.2
## [46] rstatix_0.7.0 Rcpp_1.0.9 viridisLite_0.4.1
## [49] xtable_1.8-4 bit_4.0.4 mclust_5.4.10
## [52] stats4_4.2.1 sampling_2.9 httr_1.4.4
## [55] RColorBrewer_1.1-3 fpc_2.2-9 modeltools_0.2-23
## [58] ellipsis_0.3.2 farver_2.1.1 pkgconfig_2.0.3
## [61] reshape_0.8.9 flexmix_2.3-18 nnet_7.3-17
## [64] sass_0.4.2 ggseqlogo_0.1 dbplyr_2.2.1
## [67] utf8_1.2.2 labeling_0.4.2 tidyselect_1.1.2
## [70] rlang_1.0.4 reshape2_1.4.4 later_1.3.0
## [73] munsell_0.5.0 cellranger_1.1.0 tools_4.2.1
## [76] cachem_1.0.6 cli_3.3.0 generics_0.1.3
## [79] broom_1.0.0 evaluate_0.16 fastmap_1.1.0
## [82] yaml_2.3.5 bit64_4.0.5 knitr_1.39
## [85] fs_1.5.2 robustbase_0.95-0 nlme_3.1-159
## [88] mime_0.12 xml2_1.3.3 compiler_4.2.1
## [91] shinythemes_1.2.0 rstudioapi_0.14 ggsignif_0.6.3
## [94] reprex_2.0.2 bslib_0.4.0 stringi_1.7.8
## [97] highr_0.9 lattice_0.20-45 Matrix_1.4-1
## [100] vctrs_0.4.1 stringdist_0.9.8 pillar_1.8.1
## [103] lifecycle_1.0.1 jquerylib_0.1.4 GlobalOptions_0.1.2
## [106] httpuv_1.6.5 R6_2.5.1 bookdown_0.27
## [109] promises_1.2.0.1 gridExtra_2.3 codetools_0.2-18
## [112] MASS_7.3-58.1 assertthat_0.2.1 withr_2.5.0
## [115] rlist_0.4.6.2 diptest_0.76-0 parallel_4.2.1
## [118] hms_1.1.2 quadprog_1.5-8 grid_4.2.1
## [121] class_7.3-20 rmarkdown_2.14 carData_3.0-5
## [124] ggpubr_0.4.0 shiny_1.7.2 lubridate_1.8.0